CHAPTER 1 OVERVIEW


This technical report provides an overview of the New York State Alternate Assessment 
(NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and 
implement the NYSAA program, and stakeholder involvement in those processes. By comparing the
intent of the NYSAA with its process and design, the validity of the assessment’s use can be evaluated. 
Starting with the 2013-14 NYSAA, a new test design was implemented for all content areas. For the 
2013-14 NYSAA, new English Language Arts (ELA) and mathematics Extensions, which are aligned to 
the Common Core Learning Standards (CCLS), were developed. For science and social studies, the
Alternate Grade-Level Indicators (AGLIs), which are aligned to the New York State Learning Standards,
were developed in 2006-07 and 2007-08. The processes for developing the Extensions and
AGLIs are presented in detail. Stakeholder input in the development of the overall NYSAA process
is also described in detail, including the content alignment of the Extension and AGLI design following the
new Blueprint/test design, the Assessment Task development, the teacher trainings for administration,
the Scoring Trainings and process, and the standard setting.


1.1 PURPOSE OF THIS REPORT 


The purpose of this report is to document the technical aspects of the 2014-15 NYSAA. During
the 2014-15 school year, approximately 22,226 students in Grades 3 through 8 and in high school
were administered the NYSAA. ELA and mathematics were assessed at the Grades 3 through 8 and 
high school levels; science was assessed at the Grades 4 and 8 and high school levels; and social 
studies was assessed at the high school level. 

Several technical aspects of the NYSAA are described in an effort to contribute to evidence 
supporting the validity of NYSAA score interpretations. Because the interpretations of the test scores 
are evaluated for validity, not the test itself, this report presents documentation to substantiate intended 
interpretations (AERA et al., 2014). Each chapter in this section contributes important information 
regarding the test’s validity by addressing one or more of the following aspects of the NYSAA: 
Extension and AGLI and Assessment Task development, alignment, administration, scoring, reliability, 
standard setting, and achievement levels. 

Standards for Educational and Psychological Testing (AERA et al., 2014) provides a framework 
for describing sources of evidence that should be considered when constructing an argument for 
assessment validity. These sources are found in five general areas: test content, response processes, 
internal structure, relationship to other variables, and consequences of testing. Although each of these 
sources may speak to a different aspect of validity, they are not distinct types of validity. Instead, each
contributes to a body of evidence about the comprehensive validity of score interpretations.


1.2 ORGANIZATION OF THIS REPORT


This report is organized based on the conceptual flow of the NYSAA as a multiyear
process, which includes Blueprint design/development (completed in 2012-13), Extension (completed
in 2012-13) and AGLI (completed in 2006-07 and 2007-08) development, Assessment Task
development (completed in 2012-13), administration (completed in 2014-15), scoring (completed in
2014-15), standard setting (completed in 2013-14), technical characteristics, and validity. The
appendices contain supporting documentation.


1.3 CURRENT YEAR UPDATES 


The NYSAA was redesigned, beginning with the 2013-14 administration. There were no
changes from the 2013-14 administration to the 2014-15 administration. The overall structure of the
NYSAA remains consistent with past practice in that it is a datafolio-style assessment that includes
student performance data and evidence, and it is designed to assess students with severe cognitive
disabilities.


CHAPTER 2 THE STATE ASSESSMENT SYSTEM


In New York State, both the general large-scale assessments and the alternate assessment test 
students on English Language Arts (ELA) and mathematics curriculum content taught during Grades 3 
through 8 and high school; on science content taught during Grades 4 and 8 and high school; and on 
social studies content taught during high school. All students participate in the statewide assessment 
program through the following: the general assessments, with or without accommodations; the alternate
assessment, with or without accommodations; or a combination of the general and alternate
assessments, with or without accommodations.


2.1 INTRODUCTION 


The New York State Alternate Assessment (NYSAA) is designed to provide a snapshot of an 
individual student's performance. A broader picture will emerge as the student results on the NYSAA 
are reviewed, along with results on other classroom and district assessments. 

The NYSAA is a datafolio-style assessment that measures how well students with severe 
cognitive disabilities meet the standards at alternate achievement levels. All students, including those 
with severe cognitive disabilities, are required by federal law to have access to the general education 
curriculum. The New York State Education Department (the Department) has aligned Extensions with 
Common Core Learning Standards (CCLS) in ELA and mathematics, and Alternate Grade-Level 
Indicators (AGLIs) with the core curriculum in science and social studies, for the administration of the 
NYSAA. The content-area subject matter assessed by the NYSAA is clearly related to the grade-level 
content. While the content is reduced in scope and complexity, students with severe cognitive 
disabilities are held to high expectations in order to achieve the standards. The Extensions and AGLIs 
afford students a richer learning experience. 

School districts across the United States are required to assess all students according to federal 
statute and state regulations. Assessment results tell educators how students are progressing and 
signal where changes may need to be made in the curriculum and/or instruction at the district, school, 
and classroom levels. Teachers should assess students in all areas (academic, social, etc.) on an 
ongoing basis, as part of the instruction cycle.

The No Child Left Behind (NCLB) Act of 2001 and the NYSAA are, in part, designed to raise 
expectations for students’ academic achievement. Students with severe cognitive disabilities, when 
given the appropriate instruction and access to the general education curriculum, have demonstrated 
progress in their knowledge, skills, and understanding in academic content areas that was not initially
anticipated by school personnel or parents. Higher expectations require that students with severe
cognitive disabilities have access to the general education curriculum and be provided with specialized
instruction, as well as participate in national, state, and local assessment programs.

The administration period for the 2014-15 NYSAA was September 29, 2014 through February
27, 2015. The scoring period for the 2014-15 NYSAA was March 16 to May 1, 2015. The general
sequence of events for administering the NYSAA is summarized below.


Summary of NYSAA Events 


a. Each student’s Committee on Special Education (CSE) determines how a student
   participates in the New York State Testing Program. The CSE uses the Department's
   guidelines regarding eligibility and participation criteria to guide its decision-making.

b. For each content area assessed, the student’s instructional team, headed by the Lead
   Special Education Teacher (teacher), considers the most appropriate level of complexity
   for the student. Five Extensions are required in each of ELA and mathematics, and two
   AGLIs are required in each of science and social studies.

c. Parents/guardians meet with the teacher to discuss how the NYSAA is administered and
   which specific Extensions and AGLIs will be used to assess their child.

d. Members of the student’s instructional team administer the baseline data point early in the
   administration period and document and rate student performance. Based on the results
   of the baseline assessment, the teacher will determine whether it is necessary to select
   another. The baseline data point serves two purposes: first, to confirm that the
   appropriate Level of Complexity has been selected and, second, to confirm that the
   student has not already mastered the selected skill. The baseline score cannot be higher
   than 74%.

e. Once the baseline administration confirms the task to be assessed, the instructional
   team provides instruction on the assessed skill, continuing to evaluate student progress
   until it appears that the student has acquired the skill.

f. Following the instructional period, a final data point is administered and scored for Level
   of Accuracy. The date of the final data point should not be less than 15 school days after
   the date of the baseline data point, and should occur as close to the end of the
   administration period as possible (no later than February 27, 2015). Similar items and
   materials should be used for both the baseline and final administrations.

g. The teacher assembles a datafolio containing the evidence of student performance and
   the percentages of the student’s Level of Accuracy. The completed datafolio is submitted,
   on or before the last day of the administration period, to the building administrator, who
   then ships it to the regional Scoring Institute.

h. The NYSAA datafolios are scored at regional NYSAA Scoring Institutes during the
   scoring period defined by the Department.

i. Student reports are created and are made available to school districts, teachers, and
   parents/guardians.
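
The baseline and timing rules in the events above can be expressed as a small check. The following is a minimal sketch, not part of the NYSAA materials: the function and constant names are hypothetical, and weekdays are used as a rough stand-in for school days because this report does not give a school calendar (holidays and closures are ignored).

```python
from datetime import date, timedelta

# Dates and limits taken from the report text for the 2014-15 administration.
ADMIN_START = date(2014, 9, 29)
ADMIN_END = date(2015, 2, 27)
MAX_BASELINE_ACCURACY = 74      # percent; the baseline score cannot be higher
MIN_SCHOOL_DAYS_BETWEEN = 15


def weekdays_between(start, end):
    """Count Monday-Friday dates after `start`, up to and including `end`.

    Assumption: weekdays approximate school days; holidays are ignored.
    """
    count = 0
    day = start
    while day < end:
        day += timedelta(days=1)
        if day.weekday() < 5:  # 0-4 are Monday-Friday
            count += 1
    return count


def check_data_points(baseline_date, baseline_accuracy, final_date):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if not ADMIN_START <= baseline_date <= ADMIN_END:
        problems.append("baseline outside administration period")
    if not ADMIN_START <= final_date <= ADMIN_END:
        problems.append("final outside administration period")
    if baseline_accuracy > MAX_BASELINE_ACCURACY:
        problems.append("baseline accuracy above 74%")
    if weekdays_between(baseline_date, final_date) < MIN_SCHOOL_DAYS_BETWEEN:
        problems.append("fewer than 15 school days between baseline and final")
    return problems


if __name__ == "__main__":
    # Hypothetical student: baseline on October 6, final on February 20.
    print(check_data_points(date(2014, 10, 6), 40, date(2015, 2, 20)))
```

A real check would replace `weekdays_between` with a count based on the district's actual school calendar.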


2.2 ALTERNATE ASSESSMENT BASED ON ALTERNATE ACHIEVEMENT STANDARDS


Up to 1% of New York State students in the grades tested may show academic proficiency 
through administration of an alternate assessment that is based on alternate achievement standards. 
The NYSAA is designed for those students with such severe cognitive disabilities that they are unable, 
even with the best instruction and appropriate accommodations, to participate in a general New York 
State assessment. The NYSAA is designed under the guiding philosophy that alternate achievement 
standards are built upon measurable, targeted skills linked to the CCLS in ELA and mathematics and to 
the core curriculum’s Grade-Level Indicators in science and social studies. However, the alternate 
achievement standards represent student performance at lower levels of breadth, depth, and
complexity than those found in the general assessments.


2.3 THE ALTERNATE ASSESSMENT SYSTEM 


The Individuals with Disabilities Education Act of 1997 (IDEA of 1997) requires that students 
with disabilities be included in each state’s system of accountability and have access to the general 
curriculum. The federal reauthorization of the Elementary and Secondary Education Act, known as the 
NCLB Act of 2001, also speaks to the inclusion of all children in a state’s accountability system by 
requiring states to report achievement for all students, as well as for groups of students on a 
disaggregated basis. These federal laws reflect an ongoing concern about equity: All students need to 
be academically challenged and taught to high standards. It is also necessary that all students be 
involved in the educational accountability system. Alternate achievement standards are reduced in 
breadth, depth, and complexity, but are linked to the same general curriculum standards taught to all 
students. 

The IDEA of 1997 and the NCLB Act of 2001 require that all students, regardless of
disability, participate in a statewide assessment system and be held accountable to the state standards.
The NYSAA was developed to meet the requirements of these federal mandates; to provide a 
technically sound method to observe and record student achievement; to represent the breadth and 
depth of statewide content; to promote access to the general curriculum; to provide critical information 
to the CSE for use in the development of Individualized Education Programs (IEPs); and to meet 
criteria for alignment, access, burden, bias, sensitivity, and age appropriateness for students with 
severe cognitive disabilities. The 2013-14 NYSAA was the first year of administration of the new test 
design linked to the Common Core Learning Standards for ELA and mathematics. This same test 
design was used for the 2014-15 NYSAA administration. 


2.4 PURPOSE OF THE ALTERNATE ASSESSMENT SYSTEM


The NYSAA measures the achievements of students with severe cognitive disabilities relative to 
the New York State Learning Standards, using alternate achievement levels based on a datafolio 
approach (as described in the next section). To ensure that this student population has access to the 
general education curriculum, for the NYSAA administration, the Department aligned the Extensions 
and AGLIs (discussed in the next section) with the CCLS in ELA and mathematics and with the core 
curriculum’s Grade-Level Indicators in science and social studies. 

The NYSAA is, in part, designed to raise expectations for students’ academic achievement. 
Experience has shown that students with severe cognitive disabilities, when given appropriate 
instruction and access to the general education curriculum, demonstrate unanticipated progress in their 
knowledge, skills, and understanding in academic content areas. Prior to 2006-07, access to the 
general education curriculum was not necessarily a part of instructional programs for students with 
severe cognitive disabilities. In a recent survey of teachers who administered the NYSAA in 2014-15,
56% agreed that the Extensions and AGLIs assessed in the NYSAA made the grade-level core 
curricula more accessible, and said that the AGLIs are used in planning daily instruction. 

The process for assessing the academic achievements of students who have severe cognitive 
disabilities and who are eligible for the NYSAA is outlined through structured guidelines and steps in 
the 2014-15 NYSAA Administration Manual (accessible at 
http://www.p12.nysed.gov/assessment/nysaa/nysaa-manual-15.html). The process of datafolio
development (see Chapter 7) supports the procedural validity of assessing students with severe
cognitive disabilities, while being flexible enough to meet each individual student’s learning needs and
modalities.


2.5 TEST USE AND DECISIONS BASED ON ALTERNATE ASSESSMENT 


New York State conducts a statewide assessment program on an annual basis for all students 
in Grades 3 through 8 and high school. The NYSAA ensures that students with severe cognitive 
disabilities are included in the New York State Testing Program and that their results are included in all 
Adequate Yearly Progress (AYP) determinations. 

The NYSAA is a datafolio-style assessment based on the assessment of Extensions and AGLIs. 
A datafolio is a collection of evidence of a student’s academic performance, which is compiled by the 
student’s instructional team and scored by qualified Scorers. By gathering performance data, the 
instructional team can provide parents/families/guardians and the CSE with an understanding of the 
student’s knowledge, skills, and understanding as they relate to the CCLS in ELA and mathematics, 
and the New York State core curriculum in science and social studies. The CSE can use the datafolio
to understand the student’s achievement relative to these standards and to contribute to the
development of the student’s IEP. Datafolios are scored during a standardized scoring period each 
spring. The NYSAA student reports are generally available in the fall following administration.

Performance levels, based on alternate academic achievement standards, were developed
through a rigorous standard-setting process in June 2014. Alternate Performance Level Descriptors
(APLDs) that outline the knowledge, skills, and understanding that a student may demonstrate within
each grade and content area were edited and refined by panelists during the standard-setting process.
The APLDs and datafolios provide information to parents/families/guardians, the CSE, and the
instructional team regarding potential modifications or adjustments to the student's instructional
program.


2.6 BACKGROUND AND GENERAL FORMAT 


A datafolio is a collection of evidence of a student's academic performance compiled by the 
student's instructional team and scored by qualified Scorers. Instructional team members document 
student performance by recording the student’s Level of Accuracy percentage as he or she performs an 
Assessment Task on two different dates, a baseline data point and a final data point, within the 
administration period. To verify this documentation, each datafolio must include student work products, 
Data Collection Sheets, photographs, or digital video and/or audio recordings. Teachers complete the 
required forms and submit all documentation and evidence in a binder or fastened folder for regional 
scoring. 

Teachers are provided with a NYSAA Administration Manual that outlines the assessment 
requirements and the steps for compiling a datafolio, and includes the documentation forms and the 
NYSAA Frameworks as appendices. The NYSAA Frameworks include an introduction, and the NYSAA 
Test Blueprints outline the standards that will be assessed via the alternate assessment for each grade. 
The Test Blueprints illustrate, for each content area (e.g., ELA, mathematics, science, and social 
studies), the major areas of the standards’ focus that teachers must assess at each grade. For both 
ELA and mathematics, five standards are assessed for each student; in science and social studies, two 
standards are assessed. In ELA and mathematics, teachers select an Extension from one of three 
Levels of Complexity, based on their students’ needs. For students taking the NYSAA in science 
(Grades 4 and 8 and high school) and social studies (high school), teachers select an AGLI from one of 
three Levels of Complexity. 
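
The grade coverage and the number of required Extensions or AGLIs described above can be summarized as a lookup. This is a minimal sketch under stated assumptions: the names `GRADES_ASSESSED`, `REQUIRED_COMPONENTS`, and `requirements_for_grade` are hypothetical, and `"HS"` is used here as a shorthand for high school.

```python
# Grades at which each content area is assessed ("HS" = high school).
GRADES_ASSESSED = {
    "ELA": ["3", "4", "5", "6", "7", "8", "HS"],
    "mathematics": ["3", "4", "5", "6", "7", "8", "HS"],
    "science": ["4", "8", "HS"],
    "social studies": ["HS"],
}

# Component type and number of standards assessed per student in each content area.
REQUIRED_COMPONENTS = {
    "ELA": ("Extension", 5),
    "mathematics": ("Extension", 5),
    "science": ("AGLI", 2),
    "social studies": ("AGLI", 2),
}


def requirements_for_grade(grade):
    """Return {content area: (component type, required count)} for one grade."""
    return {
        area: REQUIRED_COMPONENTS[area]
        for area, grades in GRADES_ASSESSED.items()
        if grade in grades
    }


if __name__ == "__main__":
    for grade in ["3", "4", "HS"]:
        print(grade, requirements_for_grade(grade))
```

Under this reading, a Grade 4 student is assessed on five ELA Extensions, five mathematics Extensions, and two science AGLIs.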

Teachers must identify one Extension or AGLI, based on the student’s assessed grade level. 
The grade level corresponds to the student’s date of birth. For each Extension or AGLI, the teacher 
must collect and document student performance data from an Assessment Task administered on two 
separate dates: the baseline and the final. Once the baseline is administered, the teacher continues to
provide instruction and evaluate student progress until the student's performance plateaus or the
administration period ends. At this point, the final administration is
conducted. The administration guidelines recommend that there be at least 15 school days between the
baseline and final administrations, which allows time for instruction and for the student to learn the new skill.
More than 15 days is acceptable, as long as the final administration takes place prior to the end of the
administration period. One piece of verifying evidence must be submitted to demonstrate the student's
performance for each of the documented data points (baseline and final).
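
The evidence requirement can be illustrated with a simple completeness check. This is a sketch only: the dictionary layout, the evidence labels, and the function name `missing_evidence` are assumptions made for illustration and do not reflect the actual NYSAA forms.

```python
# Evidence types named in this section; the label spellings are illustrative.
EVIDENCE_TYPES = {
    "student work product",
    "data collection sheet",
    "photograph",
    "video recording",
    "audio recording",
}
DATA_POINTS = ("baseline", "final")


def missing_evidence(datafolio):
    """Return (component, data point) pairs that lack recognized verifying evidence.

    `datafolio` maps each Extension or AGLI name to a dict of the form
    {"baseline": [evidence labels], "final": [evidence labels]}.
    """
    gaps = []
    for component, points in datafolio.items():
        for point in DATA_POINTS:
            evidence = points.get(point, [])
            if not any(item in EVIDENCE_TYPES for item in evidence):
                gaps.append((component, point))
    return gaps


if __name__ == "__main__":
    # Hypothetical datafolio with one incomplete component.
    example = {
        "ELA Extension 1": {"baseline": ["photograph"], "final": ["data collection sheet"]},
        "ELA Extension 2": {"baseline": ["student work product"], "final": []},
    }
    print(missing_evidence(example))
```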


2.7 TESTING ACCOMMODATIONS 


The CSE determines whether a student will participate in the alternate assessment with or 
without accommodations. Guidelines regarding accommodations are provided in the NYSAA 
Administration Manual. The CSE determines which testing accommodations are required, based on the
student’s documented needs. Testing accommodations:

▪ are consistent with the student’s IEP;
▪ are designed to allow the student to demonstrate his or her knowledge, skills, and
  understanding with greater independence;
▪ do not change the level of the assessment, the construct of the assessment, or the
  criteria of the Assessment Task; and
▪ are provided to the student during instruction and not just for assessment.


For more information on testing accommodations, refer to Test Access and Accommodations for
Students with Disabilities: Policy and Tools to Guide Decision-Making and Implementation (May 2006)
at www.p12.nysed.gov/specialed/publications/policy/testaccess/policyguide.htm.

Frequently asked questions about testing accommodations and the NYSAA can be found at
www.p12.nysed.gov/assessment/nysaa/home.html.


CHAPTER 3 THE STUDENTS


New York State conducts a statewide testing program on an annual basis for all students in 
Grades 3 through 8 and high school. The New York State Alternate Assessment (NYSAA) is a part of 
this statewide testing program. Designed for students with severe cognitive disabilities, the NYSAA 
measures student progress toward meeting the learning standards established for all students in the 
academic content areas of English Language Arts (ELA), mathematics, science, and social studies. 
The NYSAA ensures that students with severe cognitive disabilities are included in the New York State
Testing Program (NYSTP) and that their results are accounted for as required by the No Child Left Behind
(NCLB) Act of 2001 and the Individuals with Disabilities Education Act (IDEA) of 1997.


3.1 TARGET POPULATION 


The target population for the NYSAA is extremely specific, and participation is limited to 
students with severe cognitive disabilities. The eligibility and participation criteria provide a definition of 
a student with a severe disability in accordance with Section 100.1 of the Regulations of the 
Commissioner of Education. For reference, this information is provided in the NYSAA Administration 
Manual and on the Web site of the New York State Education Department (the Department). 

“Students with severe disabilities” refers to students who have limited cognitive abilities, 
combined with behavioral and/or physical limitations, and who require highly specialized educational 
and/or social, psychological, and medical services in order to maximize their full potential for useful and 
meaningful participation in society, and for self-fulfillment. Students with severe disabilities may 
experience severe speech, language, and/or perceptual-cognitive impairments and challenging 
behaviors that interfere with learning and socialization opportunities. These students may also have 
extremely fragile physiological conditions and may require personal care, physical/verbal supports, and 
assistive technology devices. 

The process of determining eligibility begins with the Committee on Special Education (CSE).
The CSE determines, on an individual basis, whether the student will participate in:

▪ the State’s general assessment, with or without accommodations;
▪ the State’s alternate assessment, with or without accommodations; or
▪ a combination of the State’s general assessment for some content areas and the State’s
  alternate assessment for other content areas.

The CSE ensures that decisions regarding participation in the State Testing Program are not
based on:

▪ category of disability,
▪ language differences,
▪ excessive or extended absences, or
▪ cultural or environmental factors.


The CSE also ensures that each student has a personalized system of communication that addresses 
his or her needs regarding disability, culture, and native language so that the student can demonstrate 
his or her present level of performance. Tests and other assessment procedures are conducted 
according to the requirements of Section 200.4(b)(6) of the Regulations of the Commissioner of 
Education and Section 300.320(a)(6) of the Code of Federal Regulations. 

Only students with severe cognitive disabilities are eligible for the NYSAA. The CSE determines 
whether a student with a severe cognitive disability is eligible to take the NYSAA, based on the 
following criteria:

▪ the student has a severe cognitive disability and significant deficits in
  communication/language and significant deficits in adaptive behavior; and
▪ the student requires a highly specialized educational program that facilitates the
  acquisition, application, and transfer of skills across natural environments (home, school,
  community, and/or workplace); and
▪ the student requires educational support systems, such as assistive technology,
  personal care services, health/medical services, or behavioral intervention.
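
Because the criteria above are joined by "and", all of them must hold for a student to be eligible. The sketch below only illustrates that conjunctive structure; the class and field names are hypothetical, and the actual determination is made by the CSE on an individual basis, not by a formula.

```python
from dataclasses import dataclass


@dataclass
class EligibilityFindings:
    """Hypothetical record of CSE findings, one field per stated condition."""
    severe_cognitive_disability: bool
    significant_communication_deficits: bool
    significant_adaptive_behavior_deficits: bool
    requires_highly_specialized_program: bool
    requires_educational_support_systems: bool


def meets_all_criteria(findings):
    """Return True only when every listed condition holds."""
    return all([
        findings.severe_cognitive_disability,
        findings.significant_communication_deficits,
        findings.significant_adaptive_behavior_deficits,
        findings.requires_highly_specialized_program,
        findings.requires_educational_support_systems,
    ])


if __name__ == "__main__":
    print(meets_all_criteria(EligibilityFindings(True, True, True, True, True)))
    print(meets_all_criteria(EligibilityFindings(True, True, True, True, False)))
```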


While the State Testing Program provides full access to all students, 1% of students with severe 
cognitive disabilities in Grades 3 through 8 and high school are alternately assessed and are counted 
as proficient for purposes of accountability. 

In accordance with 34 CFR 200.13 Adequate Yearly Progress in General, there is a 1% cap on
the number of proficient and advanced scores on the alternate assessment that may be included in
Adequate Yearly Progress (AYP) calculations at both the State and district levels.
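
The effect of the cap can be illustrated with a short calculation. This is a rough sketch, not the regulatory procedure: it assumes the cap is taken as 1% of all students in the grades tested and rounded down to a whole student, and the function name and the district figures in the example are hypothetical. The regulation itself governs the actual computation.

```python
def countable_proficient_scores(alt_proficient, students_in_tested_grades, cap_percent=1):
    """Return how many proficient/advanced alternate-assessment scores may be counted.

    Assumptions: the cap is `cap_percent` of all students in the tested grades,
    rounded down; integer arithmetic avoids floating-point surprises.
    """
    cap = students_in_tested_grades * cap_percent // 100
    return min(alt_proficient, cap)


if __name__ == "__main__":
    # Hypothetical district: 25,000 students in tested grades, 300 proficient NYSAA scores.
    print(countable_proficient_scores(300, 25_000))
```

In this hypothetical district the cap is 250 scores, so 50 of the 300 proficient scores would fall above it.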


3.2 SUMMARY OF PARTICIPATION RATES 


Tables 3-1 through 3-4 show a summary of participation in the 2014-15 NYSAA by
demographic category for each content area.


Table 3-1. 2014-15 NYSAA: Summary of Participation - English Language Arts

                                          Number        Percent
Demographic Group                         Tested  Participation
All Students                              21,710         100.00
Male                                      14,773          68.05
Female                                     6,937          31.95
American Indian/Alaskan Native               190           0.88
Black                                      5,407          24.91
Asian                                      1,176           5.42
Hispanic                                   5,842          26.91
White                                      8,779          40.44
Native Hawaiian/Other Pacific Islander        85           0.39
Multi                                        231           1.06
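
The Percent Participation column appears to be each group's count divided by the All Students count. The following minimal sketch recomputes the Table 3-1 percentages from the reported counts; the variable and function names are hypothetical, and rounding to two decimal places is an assumption that happens to reproduce the reported values.

```python
# Number tested by group, copied from Table 3-1 (English Language Arts).
ELA_TESTED = {
    "Male": 14773,
    "Female": 6937,
    "American Indian/Alaskan Native": 190,
    "Black": 5407,
    "Asian": 1176,
    "Hispanic": 5842,
    "White": 8779,
    "Native Hawaiian/Other Pacific Islander": 85,
    "Multi": 231,
}
ELA_ALL_STUDENTS = 21710


def participation_percent(count, total):
    """Return a group's share of all tested students, as a percent with two decimals."""
    return round(100 * count / total, 2)


if __name__ == "__main__":
    for group, count in ELA_TESTED.items():
        print(f"{group}: {participation_percent(count, ELA_ALL_STUDENTS):.2f}")
```

The same calculation applies to Tables 3-2 through 3-4 with their own counts.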


Table 3-2. 2014-15 NYSAA: Summary of Participation - Mathematics

                                          Number        Percent
Demographic Group                         Tested  Participation
All Students                              21,716         100.00
Male                                      14,771          68.02
Female                                     6,945          31.98
American Indian/Alaskan Native               190           0.87
Black                                      5,409          24.91
Asian                                      1,177           5.42
Hispanic                                   5,841          26.90
White                                      8,784          40.45
Native Hawaiian/Other Pacific Islander        84           0.39
Multi                                        231           1.06


Table 3-3. 2014-15 NYSAA: Summary of Participation - Science

                                          Number        Percent
Demographic Group                         Tested  Participation
All Students                               9,231         100.00
Male                                       6,183          66.98
Female                                     3,048          33.02
American Indian/Alaskan Native                74           0.80
Black                                      2,324          25.18
Asian                                        492           5.33
Hispanic                                   2,364          25.61
White                                      3,847          41.67
Native Hawaiian/Other Pacific Islander        39           0.42
Multi                                         91           0.99


Table 3-4. 2014-15 NYSAA: Summary of Participation - Social Studies

                                          Number        Percent
Demographic Group                         Tested  Participation
All Students                               2,886         100.00
Male                                       1,875          64.97
Female                                     1,011          35.03
American Indian/Alaskan Native                26           0.90
Black                                        731          25.33
Asian                                        141           4.89
Hispanic                                     657          22.77
White                                      1,298          44.98
Native Hawaiian/Other Pacific Islander        10           0.35
Multi                                         23           0.80


CHAPTER 4 TEST DEVELOPMENT


4.1 FRAMEWORKS OF THE TESTING PROGRAM 


The New York State Common Core Learning Standards (CCLS) provide the framework for the 
New York State Testing Program in English Language Arts (ELA) and mathematics. The State’s core 
curriculum learning standards provide the framework for the New York State Testing Program in 
science and social studies. Each statewide assessment program has a Test Blueprint that outlines the 
priorities to be assessed based on the grade-level learning standards. The redesign of the New York 
State Alternate Assessment (NYSAA), which began with the 2013-14 administration, was done in 
response to changes that the New York State Education Department (the Department) made to the 
general education assessments to assess the CCLS in ELA and mathematics. The general education 
assessment Blueprints were used as the basis for the development of the alternate assessment Test 
Blueprints, which, in turn, drove the alternate assessment content. There is one alternate assessment 
Blueprint for each of the four content areas assessed (See Appendix A). 

In May 2012, the Department assembled teacher committees to review the Test Blueprints for 
ELA and mathematics. The committees’ goal was to develop Essences and Extensions for each standard.
Groups focused on designing Extensions that aligned to general education grade-level content and, 
most importantly, were appropriate for students with severe cognitive disabilities. The draft Essences 
were reviewed by the Department and Measured Progress, and then posted on September 10, 2012 for 
public comment through October 5, 2012. A total of 852 respondents began the survey and 66.7% 
completed the survey. To the greatest extent possible, feedback collected from the survey was 
incorporated into the Essence and Extension documents. 

In October 2012, the groups were reassembled with the purpose of developing Assessment 
Tasks for the approved Extensions. The draft Assessment Task documents were posted on December 
7, 2012 for public comment through January 4, 2013. A total of 1,026 respondents began the survey 
and 60.3% completed the survey. To the greatest extent possible, feedback collected from the survey 
was incorporated into the Assessment Task documents. 

The Department followed a similar process in fall 2006, when it assembled special education 
and general education teacher committees to review the core curricula and general education 
assessment Blueprints for science and social studies. This group’s goal was to determine academic 
content priorities for the NYSAA, based on the core curricula, general education assessment
Blueprints, and, most importantly, applicability for students with severe cognitive disabilities. The
process was designed to ensure alignment with general education grade-level content and to promote
higher expectations for students taking the NYSAA.

The special education and general education teacher committees’ discussions focused on the 
actual depth and breadth of the alternate assessment requirements. Throughout the review, 
psychometricians from the Department and Measured Progress provided direction for maintaining a 
valid and reliable assessment. The resulting work by the special education and general education 
teacher committees expanded the standards for students with severe cognitive disabilities and created 
Extensions for ELA and mathematics, and Alternate Grade-Level Indicators (AGLIs) for science and 
social studies. The Extensions and AGLIs provide entry points to the grade-level content of the 
standards so that a student’s level can be gauged in terms of the standards established for all students 
by the New York State Board of Regents. 

The Test Blueprints, CCLS and core curriculum standards, Essences, Extensions and AGLIs,
and Assessment Tasks for each grade can be found in the 2014-15 NYSAA Administration Manual:
Appendix F, NYSAA Frameworks (http://www.p12.nysed.gov/assessment/nysaa/nysaa-manual-15.html).


4.2 EXTENSIONS AND AGLIs MAPPED TO NYS LEARNING STANDARDS AND CORE 
CURRICULUM BY GRADE 


The Extensions are aligned to the State’s CCLS, and AGLIs are aligned to the New York State 
learning standards. Both the Extensions and AGLIs reflect high expectations for students with severe 
cognitive disabilities. This alignment is illustrated in Figure 4-1. 

For the Extensions, a teacher committee meeting was held in May 2012 to review the new
Test Blueprints for ELA and mathematics, and to develop Essences and Extensions. A second meeting
was held in October 2012 for the purpose of expanding the Extensions into Assessment Tasks.

For the AGLIs, teacher committee meetings were held during the summer and early fall of 2006 
to gather input on aligning the NYSAA requirements with Grade-Level Expectations and on developing 
AGLIs. Additionally, teacher committee meetings were held in spring 2007 and 2008 to further refine 
the AGLIs and to develop additional Assessment Tasks for teachers to use in the alternate 
assessment. As part of the overall redesign implemented in 2013-14, the science and social studies 
AGLIs were narrowed and the Assessment Tasks were updated to follow the same format and 
philosophical approaches as the Extensions and Assessment Tasks in ELA and mathematics. 

The Board of Regents approved a set of learning standards to guide instruction and 
assessment. The learning standards serve as the basis of the core curricula in science and social 
studies. The curriculum of each content area is divided into the following components: 


• science: standards and key ideas 
• social studies: standards and units 


Each component in a content area lists Grade-Level Expectations for student performance. These 
expectations are called grade-level performance indicators, or content understandings. 

Grade-Level Expectations are further distilled into Essences. Essences are the “big ideas” of the 
Grade-Level Expectations for a grade. Assessment is based on the Essences for each component of 
each content area. AGLIs are aligned to the Essences in terms of three Levels of Complexity. 

The Board of Regents approved the New York State P-12 CCLS for ELA and mathematics. The 
New York State Learning Standards serve as the basis of the core curricula in ELA and mathematics. 


The curriculum of each content area is divided into the following components: 

• ELA: strand and sub-strands 
• mathematics: domain 

Each component in a content area lists Grade-Specific Standards, which are further distilled into 
Essences. Assessment is based on the Essences for each component of each content area. 
Extensions are aligned to the Essences in terms of three Levels of Complexity. 


2014-15 NYSAA Technical Report: Chapter 4—Test Development -15- 


Figure 4-1. 2014-15 NYSAA: Mapping of Assessment Tasks to the Learning Standards 

[Figure: flowchart with four columns. English Language Arts (ELA) and Mathematics fall under the 
Common Core Learning Standards; Science and Social Studies fall under the NYS Learning Standards, 
Core Curriculum. In each column, the standards are distilled into Essences. For ELA and mathematics, 
the Essences lead to 5 Extensions and then to Assessment Tasks; for science and social studies, they 
lead to 2 Alternate Grade-Level Indicators (AGLIs) and then to Assessment Tasks.] 




4.3 AGLI SELECTION CRITERIA AND PROCESS 


The New York State Board of Regents committed to the Common Core State Standards 
(CCSS) in January 2010 and formally adopted the CCSS for ELA and mathematics in July 2010. The 
New York State P-12 Common Core Learning Standards (CCLS) incorporated State-specific additions 
in January 2011. The Board of Regents announced that, for students with severe cognitive disabilities, 
student progress on the CCLS would be measured beginning with the 2013-14 administration of the 
NYSAA in ELA and mathematics. In November 2011, the Department and Measured 
Progress began developing a new test design aligned to the CCLS. In this design, new Extensions 
would be developed for ELA and mathematics, and the existing AGLIs would be refined in science and 
social studies. The process for developing the Extensions and, previously, the AGLIs is outlined below. 


Extensions: 

The Department coordinated the recruitment of teacher committees, which met in May and 
October 2012. Participants were chosen by the Department with the intent that the participants would 
remain consistent across both meetings, which ensured consistency in the overall process and content 
interpretation. 

Participants were assigned to grade-content workgroups. Each group reviewed the CCLS and 
the new Test Blueprints with the purpose of developing Essence statement(s) for each standard being 
assessed. Once the Essences were developed, the panels worked to create three Extensions for each 
standard, one for each Level of Complexity (less, middle, and more complex). The following expected 
outcomes were provided to the workgroups: 

1. Each grade-content workgroup will produce a final draft version of an Essence statement 
   that addresses the emphasis of the standard (ELA) or cluster (mathematics) and an 
   Extension(s) at three complexity levels (less, middle, and more complex) following the draft 
   NYSAA Test Blueprint. The Essence and Extensions will be determined by considering: 
   • Curricular congruence and alignment; 
   • Developmental applicability for students with significant cognitive disabilities and 
     datafolio product alignment and feasibility; 
   • Applicability to transitional and career readiness skills for students with significant 
     cognitive disabilities; and 
   • Parental and special populations experiences ensuring consideration of all variations 
     of abilities of students with significant cognitive disabilities. 

2. Workgroups will have in-depth discussions between special education and general 
   education teacher committees on the standards (ELA) or clusters (mathematics), Essence 
   statements, and Extension statements to develop the final draft version of the Essences and 
   Extensions. 


During an opening session facilitated by the Department and Measured Progress, participants 
were welcomed and introduced. An overview of the process and the format of the materials were 
presented. Following the opening session, participants moved into their assigned grade-content 
workgroups. Using a standardized template, each group was asked to follow the same basic steps for 
their work. 


Step 1: Introductions and Material Review. 


The participants in each grade-content workgroup introduced themselves and indicated which 
region they were representing. A room facilitator reviewed the expectations for their work and identified 
a note taker to record key points of their discussions and decisions. Participants were asked to 
familiarize themselves with the layout of the CCLS documents and the NYSAA Alignment to CCLS 
template. 


Step 2: Develop Essence statement(s). 

Using the Alignment template as a guide, each workgroup considered the standard being 
assessed and developed one or more Essence statements. These statements narrowed the depth and 
breadth of the content, enabling students with significant cognitive disabilities to access the 
content and demonstrate their knowledge, skills, and understanding. 


Step 3: Develop Extensions aligned to the Essence statement and transition skills. 


Using the Essence statement(s), the workgroups developed three Extensions, which 
represented increasing complexity and cognitive demand. In addition, participants were asked to 
consider the Career Development and Occupational Studies standards and identify links to the CCLS. 


Step 4: Review the group work. 


Within each content area, workgroups shared the Essences and Extensions that they drafted 
regarding the progression of knowledge, skills, and understanding. 

Following the workgroup meetings, an extensive review of the draft documents was conducted 
by content experts from the Department and Measured Progress. During the summer of 2012, the draft 
documents were posted on the Department’s Web site for public comment. Based on the public 
comment, additional work was done on the Essences and Extensions before they were presented to 
the workgroups again in October 2012. 

AGLIs: 

The Stakeholder groups who met in 2006, 2007, and 2008 were named the NYSAA Revision 
Workgroup (NRWG). The participants who were chosen for the initial group remained throughout all of 
the NRWG meetings, which ensured consistency in the overall process and in content interpretation. 

As part of the implementation of the new test design, the Test Blueprints for science and social 
studies were revised to narrow the content assessed. In addition, minor editorial revisions were made 
to the AGLIs and Assessment Tasks. However, as was the case with the 2012-13 version of the 
NYSAA Frameworks, the intent of the AGLIs was not changed in any way. 

The spring 2008 NRWG process was consistent in science and social studies. The NRWG was 
not allowed to edit or change the Test Blueprints, Grade-Level Expectations, Essences, or intent of 
the AGLIs. As outlined below, for each content area, three steps were followed by the participants, and 
the fourth step was completed afterward by the content developers. 


Step 1: Present the expected outcomes for the workgroup. 


The workgroup was welcomed and thanked for participating in the revision of the NYSAA 
Frameworks. The participants introduced themselves and indicated where they were from and in which 
content area they were participating. The presentation then consisted of directing the groups through 
the materials that they would be working with and explaining the specific tasks for the grade-content 
workgroups, as well as other logistical information. The workgroup was given time for questions and 
then released into their grade-content workgroups, where they remained for the rest of the day and the 
following day. 


Step 2: Review the Frameworks and other materials. 


In order to complete the tasks required in the time allotted, each content area facilitator divided 
participants into groups by grade level and distributed the materials for review. The groups were divided 
as indicated in Table 4-1. 


Table 4-1. 2014-15 NYSAA: NRWG Participant Groups from 2008 

Content Area      Group   Grades 
Science           1       4 
                  2       8 
                  3       High School 
Social Studies    1       5 
                  2       8, High School 

Step 3: Complete the work process. 


In all the content area workgroups, the participants reviewed and edited existing Sample 
Assessment Tasks (SATs) and then worked to add new SATs. The process for adding new SATs was 
as follows: The workgroups first focused on AGLIs that did not have SATs. Then they developed 
additional SATs for AGLIs that already had at least one SAT. Throughout the editing and developing of 
SATs, each workgroup worked to ensure alignment to the AGLIs. During the editing process, the 
workgroups also identified words that they felt should be added to the glossary for each content area. 
The tasks within each content area focused on each of the outcomes identified from the revision of the 
NYSAA Frameworks. 


Step 4: Review the group work as a further check on core curriculum alignment. 


Each facilitator gathered each workgroup’s work and reviewed all edits and suggestions, as 
another check on content alignment. The edited NYSAA Frameworks then went to the Department for 
an additional content-alignment check and for finalization of each content area for the 2008-09 
administration of the NYSAA. 


4.4 ASSESSMENT TASK DEVELOPMENT 


In October 2012 the Essence/Extension workgroups were reassembled with the purpose of 
developing Assessment Tasks aligned to the Extensions. Their process was similar to the steps 
followed during the May 2012 meeting. Participants were not allowed to edit or revise the Essences or 
Extensions. Using an updated Alignment template, the groups began with the first standard in their 
assigned grade-content and developed at least one Assessment Task aligned to each Extension. For 
both the Extensions and AGLIs, an Assessment Task describes an observable student action related to 
the specific knowledge, skills, and understanding aligned to the AGLI and, in turn, to the core 
curriculum. 

The 2008 NRWG developed, edited, and refined the original Assessment Tasks aligned to the 
AGLIs. Regional Lead Trainers (RLTs), who were part of the NRWG, provided input on SATs aligned to 
the AGLIs. Teachers had the opportunity to submit assessment tasks for possible inclusion in the 
NYSAA Frameworks through the annual online teacher survey. Information collected during the 2011-12 
administration and scoring periods also influenced edits to the SATs. Edited SATs were reviewed 
and approved by the Department for the 2012-13 NYSAA Frameworks. See the following section for 
more information on task development and refer to the NYSAA Administration Manual for information 
provided to teachers regarding Assessment Task requirements. 

CHAPTER 5 TEST CONTENT 


The New York State Alternate Assessment (NYSAA) is intended to provide students with severe 
cognitive disabilities the opportunity to participate in a statewide assessment that is both meaningful 
and academically challenging. Given the wide diversity of this student population, great emphasis is 
placed on ensuring that Grade-Level Expectations within the Common Core Learning Standards 
(CCLS) for English Language Arts (ELA) and mathematics and the New York State learning standards 
for science and social studies are accessible to all students. The assessment design allows students to 
demonstrate their knowledge, skills, and understanding of the CCLS through Extensions for ELA and 
mathematics and the New York State Learning Standards through the Alternate Grade-Level Indicators 
(AGLIs) for science and social studies. The Extensions and AGLIs are organized into three Levels of 
Complexity to provide an appropriate entry point for students into the standards and to maintain the 
academic focus of the alternate assessment. Student performance data (Level of Accuracy) is 
collected by the teacher for each Extension and AGLI that the student is assessed against. 


5.1 ALTERNATE PERFORMANCE LEVEL DESCRIPTORS (APLDs) 


The APLDs, developed for the standard setting that took place in June 2014, were first used for 
the 2013-14 administration and reporting. The same APLDs were used for the 2014-15 administration 
and reporting. The purpose of the standard setting conducted in June 2014 was to establish cut scores 
for each alternate performance level in ELA and mathematics, Grades 3 through 8 and high school; in 
science, Grades 4 and 8 and high school; and in social studies, for high school. 

The APLDs provided panelists with an idea of the knowledge, skills, and understanding related 
to the CCLS for ELA and mathematics and the core curriculum for science and social studies that a 
student at each of the four performance levels might demonstrate. A final activity during standard 
setting was for each group to provide suggestions for edits to the APLDs. The New York State 
Education Department (the Department) used the input to refine the APLDs for reporting. The APLDs 
are included in the NYSAA reports for districts, schools, parents/guardians, and educators to better 
explain each performance level. 


5.2 ACCESS TO THE GENERAL CURRICULUM 


The CCLS for ELA and mathematics contain grade-level content for pre-kindergarten through 
high school. Additionally, the core curricula for science and social studies contain grade-level content at 
the elementary, intermediate, and secondary levels. These core curricula are aligned with the New York 
State Learning Standards. 

For the 2013-14 NYSAA, the Department, in cooperation with teacher committees from across 
the State, expanded the CCLS in ELA and mathematics to Extensions for students with severe 
cognitive disabilities. Previously, the Department, in cooperation with special education teacher and 
general education teacher committees from across the state, had expanded the core curriculum Grade- 
Level Expectations in science and social studies to AGLIs for students with severe cognitive disabilities. 
Extensions and AGLIs provide an entry point to the grade-level content of the standards. Extensions 
and AGLIs measure a level of mastery of the knowledge, skills, and understanding aligned with the 
CCLS and core curricula established for all students by the New York State Board of Regents. 


5.3 TEST FORMAT 


The NYSAA is a collection of student work in the form of a datafolio. The NYSAA Test 
Blueprints outline for teachers the content to be assessed at each grade and content area combination. 
The NYSAA Test Blueprints for each content area are included in Appendix A. Each of the five content 
standards is required to be assessed for ELA and mathematics within each grade. Each of the two 
content standards is required to be assessed for science and social studies within each grade. ELA and 
mathematics are assessed in Grades 3 through 8 and high school. Science is assessed in Grades 4 
and 8 and high school. Social studies is assessed in high school. Extensions and AGLIs are presented 
in the NYSAA Administration Manual in a spectrum of increasing complexity: less, middle, more. 
Extensions and AGLIs must be used as written. An Assessment Task is aligned to a specific Extension 
or AGLI. It describes the student action being assessed and outlines the basic expectation of what will 
be demonstrated in the verifying evidence. Teachers must use the Assessment Task as written, but, in 
most cases, there is more than one Assessment Task aligned to a specific Extension or AGLI that a 
teacher may select. Allowing teachers to select an Extension or AGLI and then to choose an 
Assessment Task aligned to that Extension or AGLI results in individualization while maintaining the 
content consistency of the alternate assessment. Consistency is further ensured across grade levels 
and content areas by adherence to strict administration requirements for datafolios. 
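
The grade and standard requirements stated above can be summarized in a small lookup. The following is a minimal sketch for illustration only; the names `ASSESSED` and `required_standards` are hypothetical and do not come from the NYSAA materials, and "HS" is used here as shorthand for high school.

```python
# Illustrative lookup only: all names here are hypothetical, not NYSAA terminology.
GRADES_3_TO_8 = [str(g) for g in range(3, 9)]

# Grades assessed and number of content standards per content area, as stated in the text.
ASSESSED = {
    "ELA": {"grades": GRADES_3_TO_8 + ["HS"], "standards": 5},
    "Mathematics": {"grades": GRADES_3_TO_8 + ["HS"], "standards": 5},
    "Science": {"grades": ["4", "8", "HS"], "standards": 2},
    "Social Studies": {"grades": ["HS"], "standards": 2},
}


def required_standards(grade):
    """Return {content area: number of standards to assess} for one grade."""
    return {
        area: spec["standards"]
        for area, spec in ASSESSED.items()
        if grade in spec["grades"]
    }


if __name__ == "__main__":
    for grade in ["3", "4", "HS"]:
        print(grade, required_standards(grade))
```

Under these assumptions, a Grade 4 datafolio would cover ELA, mathematics, and science, while a high school datafolio would cover all four content areas.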

A datafolio is the resulting body of evidence of a student’s academic performance across the 
content standards of selected Extensions or AGLIs, as compiled by the student’s instructional team and 
scored by qualified Scorers. For each standard in ELA and mathematics, there are three Extensions 
presented across a spectrum of complexity from least to most complex. For each standard in science 
and social studies, AGLIs are presented across a spectrum of complexity from least to most complex. 
Teachers select the Extension or AGLI most appropriate for a student and conduct the assessment. 
Student performance is rated by the student’s instructional team according to the student’s Level of 
Accuracy in performing each Assessment Task. Two dates of student performance are documented for 
each standard in all content areas. Teachers first administer a baseline data point to collect 
performance data and evidence that confirm the student has not yet mastered the assessed skill. 
Based on the outcome of the baseline data point, it may be necessary to adjust the Level of Complexity 
(choose another task at a higher or lower Level of Complexity). Following the baseline data point, 
teachers provide instruction and evaluate students to gauge student growth. Before the end of the 
administration period, a final administration is conducted and documented in the datafolio. In general, 
the Department recommends at least 15 school days between the baseline and final data points. To 
verify the baseline and final data points’ documentation, each datafolio must include verifying evidence 
that demonstrates the student’s performance of the task. Teachers may choose to submit the following 
as evidence: student work products, Data Collection Sheets, photographs, and/or digital video or audio 
recordings for the baseline administration performance and the final administration performance. 
Teachers complete the required forms and submit all documentation and evidence in a binder or 
fastened folder for regional scoring. Detailed information about the content of and procedures for 
developing the datafolio is presented in the NYSAA Administration Manual. 
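
The timing guidance above can be illustrated with a short date check. This is a sketch under stated assumptions, not a Department tool: the function names are hypothetical, the administration period dates are those given for 2014-15 in Section 7.1, and "school days" are approximated as weekdays because no school calendar (holidays, closures) is available here.

```python
from datetime import date, timedelta

# Administration period for 2014-15, as stated in Section 7.1 of this report.
PERIOD_START = date(2014, 9, 29)
PERIOD_END = date(2015, 2, 27)
RECOMMENDED_GAP = 15  # recommended school days between baseline and final


def weekdays_between(start, end):
    """Count Monday-Friday dates after `start`, up to and including `end`.

    Assumption: weekdays stand in for school days; holidays are ignored.
    """
    count = 0
    day = start
    while day < end:
        day += timedelta(days=1)
        if day.weekday() < 5:
            count += 1
    return count


def check_data_points(baseline, final):
    """Return a list of problems found with a baseline/final date pair."""
    problems = []
    for label, d in (("baseline", baseline), ("final", final)):
        if not PERIOD_START <= d <= PERIOD_END:
            problems.append(f"{label} date is outside the administration period")
    if final <= baseline:
        problems.append("final date must follow the baseline date")
    elif weekdays_between(baseline, final) < RECOMMENDED_GAP:
        problems.append("fewer than 15 weekdays between data points (recommendation only)")
    return problems


if __name__ == "__main__":
    print(check_data_points(date(2014, 10, 6), date(2014, 10, 27)))
```

Because the 15-day interval is described as a general recommendation, the sketch reports a short gap as a note and does not treat it as an error.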


5.4 ASSESSMENT DIMENSIONS 


NYSAA datafolios are scored using two dimensions: 

• Connection to Grade-Level Content 
  The Connection to Grade-Level Content dimension is met when: 
  o the Extension or AGLI is from the student’s assessed grade; 
  o the Assessment Task is clearly aligned with the Extension or AGLI; and 
  o the verifying evidence submitted is aligned with the Assessment Task. 
  Both of the connections must be clearly evident for the standard to be scored. 

• Performance 
  o Level of Accuracy is calculated as a percentage (0%-100%). 

Level of Complexity is part of the NYSAA test design and, in addition to Level of Accuracy, factors into 
a student’s overall performance level. 
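
The relationship between the two dimensions can be sketched as follows. This is an illustration under assumptions, not the scoring procedure itself: the `StandardEntry` record and its field names are hypothetical, and computing Level of Accuracy as correct responses divided by opportunities is an assumed formula, since the text above states only that it is a percentage from 0% to 100%.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandardEntry:
    """Hypothetical record of the judgments made for one assessed standard."""
    from_assessed_grade: bool  # Extension or AGLI is from the student's assessed grade
    task_aligned: bool         # Assessment Task clearly aligned with the Extension or AGLI
    evidence_aligned: bool     # verifying evidence aligned with the Assessment Task
    correct: int               # assumed input: correct responses observed
    attempted: int             # assumed input: total opportunities given


def connection_met(entry: StandardEntry) -> bool:
    """Connection to Grade-Level Content is met only when every condition holds."""
    return entry.from_assessed_grade and entry.task_aligned and entry.evidence_aligned


def level_of_accuracy(entry: StandardEntry) -> float:
    """Assumed formula: percentage of attempted responses that were correct."""
    if entry.attempted <= 0 or not 0 <= entry.correct <= entry.attempted:
        raise ValueError("correct must be between 0 and attempted, with attempted > 0")
    return 100.0 * entry.correct / entry.attempted


def score_standard(entry: StandardEntry) -> Optional[float]:
    """Return the Level of Accuracy, or None when the connection is not evident."""
    return level_of_accuracy(entry) if connection_met(entry) else None


if __name__ == "__main__":
    print(score_standard(StandardEntry(True, True, True, correct=3, attempted=4)))
```

Returning `None` when the connection is not met mirrors the statement that the connections must be evident for the standard to be scored at all.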
CHAPTER 6 ALIGNMENT 


6.1 PROMOTING ALIGNMENT THROUGH ACHIEVEMENT LEVEL DESCRIPTORS 


The Alternate Performance Level Descriptors (APLDs) for the New York State Alternate 
Assessment (NYSAA) are uniquely defined, by the use of unifying adverbs, for each grade and content 
area. The APLDs provide a structure for understanding the knowledge, skills, and understanding that a 
student may have demonstrated in their NYSAA datafolio at a performance level. The APLDs are 
meant to be a guide or a framework to give a picture of student performance. Due to the varying 
abilities of students with severe cognitive disabilities, the APLDs were developed to be a flexible 
definition of student performance on the NYSAA. The student performance documentation that is 
recorded and evidenced within the datafolio is a more prescribed and quantified system of 
documentation. 

The development of APLDs occurred in 2014 as part of the NYSAA redesign. The APLDs for 
each grade and content area provided panelists participating in standard setting with the official 
description of the knowledge, skills, and understanding that students are expected to display for each 
performance level. The APLDs were developed by using the old APLDs and the general education 
Performance Level Descriptors for Grades 3 through 8 in English Language Arts (ELA) and 
mathematics. The initial language was developed by the Regional Lead Trainers (RLTs; see Chapter 7) 
and was then refined by Measured Progress. The APLDs were reviewed, edited, and approved by the 
Department. 

The standard-setting panelists were able to come to a consensus with a generalized 
understanding of the terms described above due to their extensive knowledge of the NYSAA student 
population combined with understandings of the New York State Common Core Learning Standards 
(CCLS) for ELA and mathematics and the New York State core curricula for science and social studies. 
The APLDs provide information related to specific content assessed within a grade and content area 
and how that content skill may be performed by a student through his or her accuracy level. Many 
students who take the NYSAA have splinter skills, require various supports to perform, and can vary 
from day to day in their demonstration of the knowledge, skills, and understanding that they are working 
on within the datafolio. As such, the terms used within the APLDs provide some parameters and 
flexibility to allow for a basic picture of student performance without being specifically quantified. A set 
quantification would not allow for a representative understanding of a student with severe cognitive 
disabilities who took the NYSAA. 
CHAPTER 7 ADMINISTRATION AND TRAINING 


New York State utilizes a train-the-trainer model to provide training related to the New York 
State Alternate Assessment (NYSAA). Each Board of Cooperative Educational Services (BOCES) and 
Big Five City School District designates at least one person as an Alternate Assessment Training 
Network Specialist (AATN Specialist) and at least one person as a Score Site Coordinator (SSC). (The 
Big Five City School Districts are Buffalo, New York City, Rochester, Syracuse, and Yonkers.) AATN 
Specialists and SSCs participate in the regional Administration Training conducted in September and 
facilitated by the Department and Measured Progress. The AATN Specialist is responsible for 
conducting the NYSAA Administration Training with teachers. SSCs are responsible for the 
coordination of the regional Scoring Institutes; therefore, they also need to have an understanding of 
the NYSAA administration guidelines. In addition, nine Regional Lead Trainers (RLTs) provide technical 
assistance to assigned geographic regions across the state. The RLTs assist with administration and 
Scoring Training and Collegial Review processes, as well as provide support to teachers throughout the 
administration period via e-mail and telephone. 


7.1 STEPS FOR ADMINISTRATION 


The teacher, in coordination with the instructional team, is responsible for administering the 
NYSAA to a student. The NYSAA Administration Manual provides detailed guidelines on how to 
administer the NYSAA to a student. The NYSAA has a specific administration period during which the 
assessment can be conducted. Assessment data cannot be collected before or after the administration 
period. The administration period for 2014-15 was September 29, 2014, to February 27, 2015. The first 
step is to review the Individualized Education Program (IEP) for a student who has been designated to 
take the NYSAA and determine the grade that the student will be assessed at, using the student's date 
of birth and the NYSAA Age Range Chart. Next, the teacher determines the Extension or Alternate 
Grade-Level Indicator (AGLI) for each content standard on which the student will be assessed. For 
English Language Arts (ELA) and mathematics, five standards are assessed. For science and social 
studies, two standards are assessed. Then, the teacher determines an Assessment Task that will 
demonstrate the Extension or AGLI. The Assessment Task describes the student action being 
assessed. Once the Extensions or AGLIs and Assessment Tasks have been determined, the teacher 
conducts the Assessment Task with the student as a baseline administration. The baseline data point 
confirms that the student has not yet mastered the skill being assessed. If the student performance is 
74% or below, then the teacher can continue to assess that skill. If the student performance is 75% or 
higher, then a higher-level skill must be assessed. If this is the case, the teacher would need to conduct 
a new baseline administration. Following the baseline administration, there is a period of instruction and 
evaluation of the skill being assessed. Then, the teacher conducts the Assessment Task with the 
student as a final administration. The baseline data point and final data point administrations of the 
Extension or AGLI and Assessment Task are recorded and documented. Student performance includes 
the student’s Level of Accuracy. Verifying evidence showing the student demonstrating the knowledge, 
skills, and understanding of the Extension or AGLI through the completion of the Assessment Task 
must be included for the baseline and final administration dates of student performance documented. 
There are four types of verifying evidence that can be included, each with specific guidelines on what 
must be included for it to be considered a valid piece of evidence at scoring. The four types are student 
work products, a sequence of captioned and dated photographs, digital video or audio clips, and Data 
Collection Sheets. Each datafolio is required to have at least one Collegial Review. Collegial Review 
requires a colleague or administrator who is familiar with the NYSAA, but is not the student’s teacher 
who prepared the datafolio, to review the student's datafolio contents. 
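
The baseline rule described above (74% or below versus 75% or higher) can be written as a simple decision function. This is a minimal sketch; the function name and return strings are hypothetical. The text does not say how a value between 74% and 75% would be handled, so treating 75 as the cutoff for fractional values is an assumption.

```python
def baseline_decision(accuracy_percent):
    """Decide the next step after a baseline administration.

    Assumption: any value below 75 counts as "74% or below".
    """
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("Level of Accuracy must be between 0 and 100")
    if accuracy_percent >= 75:
        # The student has already mastered the skill at this level.
        return "assess a higher-level skill and conduct a new baseline"
    return "continue with this skill"


if __name__ == "__main__":
    for pct in (40, 74, 75, 100):
        print(pct, "->", baseline_decision(pct))
```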


7.2 STEPS IN CONSTRUCTING THE DATAFOLIO 


The NYSAA Administration Manual provides specific information on the construction and 
organization of the datafolio. For each Extension or AGLI, there must be a Data Summary Sheet. The 
Data Summary Sheet is the summarizing information regarding the Extension or AGLI. It includes 
student demographic information, the Extension or AGLI assessed, the Assessment Task, and student 
performance data. The baseline and final administration dates of performance data include the 
percentage for the Level of Accuracy. Also documented on the Data Summary Sheet for the baseline 
and final administrations is a yes or no to indicate if a student received verbal or physical cues or 
prompts to redirect or refocus the student on the Assessment Task. In addition to the individual 
requirements of each type of verifying evidence, the verifying evidence must include three required 
elements—student name, date of student performance, and Level of Accuracy. The teacher is 
responsible for ensuring that the verifying evidence connects to the Assessment Task and that it meets 
the requirements outlined in the NYSAA Administration Manual, in order to include it in the datafolio. On 
or before the end of the administration period, the teacher assembles the datafolio in a binder or 
fastened folder. The datafolio includes a NYSAA Student Page, which provides demographic 
information regarding the student and the grade assessed, supports required per the IEP, 
accommodations provided during testing, and the month that a Collegial Review was conducted. 
Although not required, a datafolio may also include a table of contents, which provides information to 
Scorers on where information is located in the datafolio. The ELA assessment documents come first, 
followed by mathematics, then science and social studies, if applicable. The Extensions or AGLIs within 
each content area are organized by using the numbers in the boxes in the upper-right corner of the 
Data Summary Sheets (Extension 1-Extension 5; AGLI 1-AGLI 2). 


For ELA and mathematics, the order of the documents is as follows: 
Extension 1: 
• Data Summary Sheet 
• Verifying evidence for the baseline data point 
• Verifying evidence for the final data point 
  o If either piece of verifying evidence is a Data Collection Sheet, the supporting 
    evidence directly follows the Data Collection Sheet. 


This order is repeated for the remaining Extensions 2, 3, 4, and 5 in ELA and mathematics. 


Figure 7-1. 2014-15 NYSAA: Datafolio Elements for ELA and Mathematics 

[Figure: ELA and Mathematics (all grades). For each of Extensions 1 through 5, the datafolio contains 
a Data Summary Sheet (DSS), verifying evidence (VE) for the baseline data point, and VE for the final 
data point.] 


For science and/or social studies, the order of documents is as follows: 
AGLI 1: 
• Data Summary Sheet 
• Verifying evidence for the baseline data point 
• Verifying evidence for the final data point 
  o If either piece of verifying evidence is a Data Collection Sheet, the supporting 
    evidence directly follows the Data Collection Sheet. 
AGLI 2: 
• Data Summary Sheet 
• Verifying evidence for the baseline data point 
• Verifying evidence for the final data point 
  o If either piece of verifying evidence is a Data Collection Sheet, the supporting 
    evidence directly follows the Data Collection Sheet. 




Figure 7-2. 2014-15 NYSAA: Datafolio Elements for Science and Social Studies 

[Figure: Science and Social Studies (grade specific). For each of AGLI 1 and AGLI 2, the datafolio 
contains a Data Summary Sheet, verifying evidence (VE) for the baseline data point, and VE for the 
final data point.] 


7.3 ADMINISTRATION TRAINING AND COLLEGIAL REVIEW 


In September 2014, the Department, in collaboration with Measured Progress, trained AATN 
Specialists and SSCs from across the state on how to conduct the NYSAA Administration Training with 
teachers in their regions. The one-day trainings were conducted regionally across the state over a two- 
week period. There were three main activities conducted. First, information regarding updates to the 
NYSAA and the materials was provided. Then, the NYSAA Administration Training DVD was shown. 
The training also included the completion and review of the Guided Practices. Last, the participants 
were asked to work in groups to discuss strategies to improve administration practices and how best to 
support teachers administering the 2014-15 NYSAA. 

A total of five NYSAA Administration Trainings occurred at four geographically diverse sites: the 
Albany region, which included Long Island and the regions surrounding New York City; the Syracuse 
region; the Buffalo and Rochester region; and the New York City region, which included the non-District 
75 trainers on one day and the District 75 trainers on another day. Table 7-1 outlines the number of 
participants at each training session. 


Table 7-1. 2014-15 NYSAA: Administration Updates Training - Participant Count

                                         Albany    Syracuse   Buffalo-Rochester   New York City Region   Total
                                         Region    Region     Region              (Two Trainings)
NYSAA Administration Updates Training    60        21         48                  119                    248


Administration Training for teachers is provided through a combination of Guided Practices and 
a training DVD. AATN Specialists are required to use all parts of the DVD and Guided Practices, as 
specified by the Department. The NYSAA Administration Training DVD is organized into segments.
There is an Opening segment; a Department Messages segment; a Steps for Administration segment;
and a Best Practices, Recommendations, and Closing segment. An additional optional segment is 
provided, which is an introduction to an online tool that teachers can use to support their administration 
practice. The opening segment provides general information about what is going to be covered during 
the training session. The Department Messages segment provides the background for the 
implementation of the NYSAA test design and alignment of ELA and mathematics to the CCLS, and 
responses to several frequently asked questions related to the assessment process for the NYSAA. 
The Steps for Administration segment is a detailed review of each of the steps for administering the 
NYSAA, including things to consider in planning for the assessment, specifics regarding administering 
the assessment, and an outline of steps for assembling and submitting the datafolio for scoring. The 
information provided in this segment follows the organization of the NYSAA Administration Manual, and 
includes many visuals to assist teachers in understanding the NYSAA. The Best Practices, 
Recommendations, and Closing segment provides best practices tips and strategies on how to 
maintain the Connection to Grade-Level Content during administration, information on prompts and 
cues, and things to keep in mind for the 2014-15 NYSAA, as well as next steps for teachers and 
information regarding Collegial Reviews. At specific points throughout the segments, there are stop 
points built in, and a Guided Practice must be conducted. The Guided Practices reinforce the 
information that was contained in the segment. There are a total of four Guided Practices. The first 
Guided Practice focuses on understanding how to use a student’s date of birth to determine the correct 
grade at which a student should be assessed, and how to use the Test Blueprints in the NYSAA 
Administration Manual: Appendix F—NYSAA Frameworks 
(http://www.p12.nysed.gov/assessment/nysaa/nysaa-manual-15.html). The second Guided Practice
focuses on understanding how to navigate through the NYSAA Frameworks to select the Extensions 
and AGLIs and how to determine some verifying evidence options for specific Extensions and AGLIs. 
The third Guided Practice focuses on determining and documenting baseline student performance and 
determining if the baseline threshold is exceeded. The fourth Guided Practice provides teachers with a 
review of a NYSAA requirements review worksheet. Teachers complete all four Guided Practices, and 
a review of the practices is facilitated by the AATN Specialists. At or before the locally conducted 
NYSAA Administration Trainings, teachers are provided with the NYSAA Administration Manual, which 
includes the NYSAA Frameworks as Appendix F. 

Collegial Review is required for each student datafolio. Collegial Review is an independent
review of a datafolio. Reviewers should:

• be familiar with the current alternate assessment; and/or
• have attended the 2014-15 Administration Training in the fall of 2014.

The Department recommends that Collegial Reviews take place during the planning phase, at a
midpoint during administration, and prior to the end of administration. The teacher is given feedback
about whether the appropriate connections have been made between the Extensions or the AGLIs and
the Assessment Tasks and between the Assessment Tasks and the verifying evidence. Also, Collegial
Reviews help to confirm that all documents included in the datafolio at that point meet all procedural
requirements. The Department cautions that a Collegial Review helps ensure, but does not guarantee,
that a datafolio meets the procedural requirements necessary for a student to receive a reportable
score.


CHAPTER 8 SCORING


Alternate Assessment Training Network Specialists (AATN Specialists) and Score Site 
Coordinators (SSCs) participate in the regional Scoring Training conducted each year. SSCs are 
responsible for the coordination of the regional Scoring Institutes, and must pass the qualification 
samples in order to make content decisions during Scoring Institutes. AATN Specialists act as Floor 
Managers at Scoring Institutes, and must also pass the qualification samples in order to make content 
decisions during Scoring Institutes. 

In March 2015, the Department, in collaboration with Measured Progress, trained AATN 
Specialists and SSCs from across the State on how to score New York State Alternate Assessment 
(NYSAA) datafolios and how to conduct the NYSAA Scoring Training with Scorers at the Scoring 
Institute in their region. The one-day trainings were conducted regionally across the State over a two- 
week period. Three main activities were conducted. First, information regarding updates to the NYSAA 
Scoring Procedures, Decision Rules, and the scoring materials was provided. Then, the NYSAA 
Scoring Training DVD was shown. The training also included the completion and review of the Guided 
Practice samples. Last, participants were asked to complete a qualification process by scoring sample
datafolios. Participants who did not meet the minimum performance requirements during the
qualification process were retrained and given the opportunity to complete the qualification
process again with a new set of sample datafolios.

A total of five NYSAA Scoring Trainings occurred at four geographically diverse sites: the 
Albany region, which includes Long Island and the regions surrounding New York City; the Syracuse 
region; the Buffalo and Rochester region; and the New York City region, which included the non-District 
75 trainers in one training session and the District 75 trainers in another training session. Table 8-1
outlines the number of participants at each training session.


Table 8-1. 2014-15 NYSAA: Scoring Training - Participant Count

                          Albany    Syracuse   Buffalo-Rochester   New York City Region   Total
                          Region    Region     Region              (Two Trainings)
NYSAA Scoring Training    44        26         33                  94                     197


8.1 SCORING OF OPERATIONAL TESTS 


The scoring of NYSAA datafolios occurs during the spring, following the close of the 
administration period. Scoring is a decentralized process carried out at regional Scoring Institutes. The
Department provides a scoring window within which the institutes conduct their scoring sessions. The
purpose of the Scoring Institute is to provide a forum for educators to score the NYSAA student
datafolios. Each Scoring Institute is overseen by an SSC and an AATN Specialist. These individuals 
are thoroughly trained and participate in a qualifying process conducted by the Department and 
Measured Progress. They are each given a duplicate set of training materials that are to be used during 
turnkey training at their own Scoring Institutes. They are required to follow the model of the training 
process demonstrated by the Department and Measured Progress. 

There are a variety of processes involved in the Scoring Institute. The basic outline for the 
review of student datafolios consists of three major steps. Scorers review student datafolios; confirm 
that the Connection to Grade-Level Content, including the baseline data point performance information, 
is satisfied; and verify that the percentage and rating for Level of Accuracy are documented by the teacher for
each Extension or Alternate Grade-Level Indicator (AGLI) assessed. Any questions that arise during 
scoring are directed to a Table Leader. Scorers use the document titled Steps for Scoring 2014-15 
NYSAA Datafolios as the main reference for scoring each datafolio. Table Leaders use the Decision 
Rules for Scoring 2014-15 NYSAA Datafolios as a reference document for any questions that are not 
addressed in the Steps for Scoring 2014—15 NYSAA Datafolios. Both documents are included in this 
report, as Appendices B (Scoring Procedures) and C (Scoring Decision Rules). 

On a worksheet, a Scorer records the Extension or AGLI code, Connection to Grade-Level 
Content questions, percentages for the Level of Accuracy for the baseline administration and final 
administration, whether or not the student was prompted, and any Scorer comments. Part of this 
worksheet is returned to the school district along with the datafolio for review by the instructional team 
and administrators. 

Once a datafolio has been reviewed completely, the last step is for the Scorer to transcribe the 
Extension or AGLI codes, Connection to Grade-Level Content questions, percentages, and other 
information onto a Scannable Score Document (Scannable). The score document is scanned by the 
Regional Information Center (RIC) or the Big Five City Scan Center. (The Big Five City School Districts
are Buffalo, New York City, Rochester, Syracuse, and Yonkers, each having its own City Scan Center.)


8.2 SCORING RUBRIC 


The Scoring Rubric is the initial guide that drives the model used to score NYSAA datafolios. 
The Scoring Rubric is provided in the 2014-15 NYSAA Administration Manual, along with guidance on 
the process that teachers must follow to meet the scoring requirements. The rubric is broken into two 
parts. The first part outlines the grades and content requirements, and provides some brief assessment 
requirements information. The second part provides information about the factors for a performance 
level. The factors included are the Connection to Grade-Level Content, student performance, and Level 
of Complexity. The Connection to Grade-Level Content is explained on the Scoring Rubric as follows:
“Extensions/AGLIs are assessed based on the appropriate grade level academic content for students
with severe cognitive disabilities. The Assessment Task must align to the Extension/AGLI chosen AND
the verifying evidence must be aligned to the task. If these connections are not clear, the
Extension/AGLI will not be scored.” The final administration Level of Accuracy provides the percentage
for the performance dimension. For each Assessment Task documented, the percentage for Level of 
Accuracy (relative to the student’s demonstration of skills, in relation to the Extension or AGLI) and the 
Level of Complexity that the Extension or AGLI came from combine to give the overall performance 
level. The Scoring Rubric is presented in Table 8-2 and the Factors for a Performance Level are 
presented in Table 8-3.


Table 8-2. 2014-15 NYSAA: Scoring Rubric

Students with disabilities participating in the NYSAA are assessed according to chronological ages aligned to grade levels. Refer to the Age Range Chart
for current date of birth ranges. Students should be tested only once at each grade and in all of the content areas indicated for each grade. For all content
areas, student performance data are collected on at least two dates within the administration period. Baseline data must be collected to confirm that the
student has not yet mastered the selected Extension or AGLI.

Grade          ELA            Mathematics    Science        Social Studies
3              5 Standards    5 Standards
4              5 Standards    5 Standards    2 Standards
5              5 Standards    5 Standards
6              5 Standards    5 Standards
7              5 Standards    5 Standards
8              5 Standards    5 Standards    2 Standards
High School    5 Standards    5 Standards    2 Standards    2 Standards
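
As a hedged illustration of how the requirements in Table 8-2 could be checked, the sketch below transcribes the table into a Python dictionary. The names `REQUIRED_STANDARDS` and `missing_standards` are hypothetical and do not come from the NYSAA materials. The counts are read from the table as printed above.

```python
# Number of standards assessed per content area, transcribed from Table 8-2.
REQUIRED_STANDARDS = {
    "3": {"ELA": 5, "Mathematics": 5},
    "4": {"ELA": 5, "Mathematics": 5, "Science": 2},
    "5": {"ELA": 5, "Mathematics": 5},
    "6": {"ELA": 5, "Mathematics": 5},
    "7": {"ELA": 5, "Mathematics": 5},
    "8": {"ELA": 5, "Mathematics": 5, "Science": 2},
    "High School": {"ELA": 5, "Mathematics": 5, "Science": 2, "Social Studies": 2},
}


def missing_standards(grade, assessed):
    """Return {content area: shortfall} where fewer standards were assessed than required.

    grade    -- a key of REQUIRED_STANDARDS, such as "4" or "High School"
    assessed -- dict mapping content area to the number of standards assessed
    """
    required = REQUIRED_STANDARDS[grade]
    return {
        area: need - assessed.get(area, 0)
        for area, need in required.items()
        if assessed.get(area, 0) < need
    }
```

An empty result means every content area required at that grade has the full number of standards.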
Table 8-3. 2014-15 NYSAA: Factors for a Performance Level: Connections to Grade-Level Content, Performance, Level of Complexity

Connection to Grade-Level Content = Extensions/AGLIs are assessed based on the appropriate grade level
academic content for students with severe cognitive disabilities. The Assessment Task must align to the
Extension/AGLI chosen AND the verifying evidence must be aligned to the task. If these connections are not clear,
the Extension/AGLI will not be scored.

Connection to Grade-Level Content Progression:
Grade -> Extension/AGLI from Grade -> Assessment Task aligned to Extension/AGLI -> Verifying evidence aligned to Assessment Task

Performance = Level of Accuracy (%)
   Level of Accuracy: The student demonstrates skills based on the Extensions or AGLIs resulting in a
   percentage for Level of Accuracy.
   Independence: Was the student prompted in any way during the administration of the Assessment
   Task? Yes or No.

Level of Complexity: Less Complex, Middle, More Complex

No Score (NS) results when one or more of these issues are identified during scoring (including but
not limited to):

Connection to Grade-Level Content
• Required standard not assessed
• Extension or AGLI assessed from incorrect grade
• Incorrect Assessment Task assessed
• Verifying evidence does not demonstrate task

Performance
• Required data points and/or evidence not submitted
• Required elements not documented on evidence
• Verifying evidence not valid

Level of Complexity
• Score for baseline administration over threshold (Level of Accuracy is 75% or higher)


8.3 SCORING PROCESS AND RELIABILITY MONITORING REVIEW


8.3.1 Scoring Process 


Scorers, who are all New York State teachers or other licensed and/or certified professionals, 
are directed to objectively review and document the ratings for student performance data contained in 
the datafolio. During the Scoring Training, it is explained that the data provide an opportunity for 
students to demonstrate their knowledge, skills, and understanding of the grade-level content. Scoring 
processes are consistent from one grade level to the next. The same procedures and rules apply to all 
grade levels and content areas, which is critical to the procedural validity of the assessment. 

Scoring Training includes a DVD presentation, a series of guided practice samples, and the 
Scorer qualification process. (These are described in further detail in the next section.) 

The actual scoring process involves reviewing the datafolio compiled by the teacher. The review 
is meant to ensure that all of the requirements are met. The Scorer records the rubric rating for each 
Extension or AGLI assessed. If the Connection to Grade-Level Content and the baseline administration 
performance are satisfied, the final performance percentages can be confirmed, and each performance 
percentage for baseline and final administrations can be recorded by the Scorer. If the Connection to 
Grade-Level Content is not met or the baseline administration performance is above the percentage 
threshold, a rating of No Score (NS) is recorded. After the Scoring Institute, the Scorer ratings are 
converted to the alternate assessment performance levels, which appear on the NYSAA reports. 
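
The rating logic in the paragraph above can be summarized in a small Python sketch. It is a simplification under stated assumptions: the function name `rate_extension` and the return format are hypothetical, and only the two conditions named in this paragraph are modeled. The 75% threshold comes from Table 8-3; the other causes of No Score listed there are not covered.

```python
BASELINE_THRESHOLD = 75  # percent; Table 8-3 treats a baseline of 75% or higher as over threshold


def rate_extension(connection_met, baseline_pct, final_pct):
    """Return "NS" or the percentages a Scorer would record for one Extension or AGLI."""
    if not connection_met or baseline_pct >= BASELINE_THRESHOLD:
        # Connection to Grade-Level Content not met, or baseline above the threshold.
        return "NS"
    return {"baseline": baseline_pct, "final": final_pct}
```

A baseline of 74% with the connection met yields recorded percentages, while a baseline of 75% yields NS, which matches Step 6 in Table 8-4.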

In order for Scorers to complete their review of the datafolios, a set of standardized tools is 
provided to each Scoring Institute. These tools include the NYSAA Administration Manual and 
Frameworks, Scoring Procedures, Scoring Decision Rules, Guided Practices, and qualifier sets. 
Student performance ratings are documented on a Scorer Worksheet, with a Menu of Comments, and 
a Scannable. The Menu of Comments, located on the back of each page of the Scorer Worksheet, 
includes information that a Scorer records when an Extension or AGLI has an NS rating. It also allows 
the Scorer to provide additional constructive feedback to a teacher about the datafolio. 

There are 14 steps involved in the scoring process. The step-by-step procedures outlined in the
Steps for Scoring 2014-15 NYSAA Datafolios are implemented statewide and ensure scoring reliability
across all Scoring Institutes. Table 8-4 presents a quick review of the steps.


Table 8-4. 2014-15 NYSAA: Scoring Steps Quick Reference

Step(s)
1              Student demographics, Scorer ID, Scoring Institute code, confirm student’s date of birth and grade
               level assessed, Testing Accommodations, and Collegial Review
2a and b       Review sequence of documentation for content area
3              Demographic information on DSS complete and accurate when compared to the Student Page
4              Extension or AGLI from grade level (Connection to Grade-Level Content)
5              Task connects to Extension or AGLI (Connection to Grade-Level Content)
6a, b, and c   Verifying evidence connects to task (Connection to Grade-Level Content), and Level of Accuracy for
               the baseline data point is 74% or below
7              Dates on DSS within the administration period
8a-f           Valid verifying evidence and supporting evidence
   a           Valid verifying evidence and supporting evidence: required elements clearly documented (3)
   b           Valid verifying evidence: Student Work Product: Original
   c           Valid verifying evidence: Data Collection Sheet (DCS): Minimum of three dates; includes supporting
               evidence and staff initials
   d           If verifying evidence is DCS, supporting evidence is present and valid
   e           Valid verifying evidence: Photographs: Minimum of three sequential, captioned, and dated
               photographs
   f           Valid verifying evidence: digital video or audio clip: Clip is brief and has recorded markers
9              Supports provided that guided the student to the correct answer
10             Confirm final administration percentage for Level of Accuracy, record percentages for Level of
               Accuracy for final and baseline administrations, record if student was prompted for final and baseline
               administrations
11             Score the second Extension or AGLI (Steps 3-10)
12             Score mathematics, science, and social studies (Steps 2-11)
13             Confirm Scorer Worksheet is complete, including Procedural Error Comments and additional Scorer
               Comments
14             Complete the Scannable Score Document
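
To make one of the validity checks in Table 8-4 concrete, the following Python sketch covers the Data Collection Sheet conditions in Steps 8c and 8d. The function name `dcs_is_valid` and its parameters are hypothetical. The sketch also assumes that "three dates" means three distinct calendar dates, which the table does not state explicitly.

```python
from datetime import date

MIN_DCS_DATES = 3  # Step 8c: "Minimum of three dates"


def dcs_is_valid(dates, has_supporting_evidence, has_staff_initials):
    """Return True when a Data Collection Sheet meets the Step 8c/8d conditions.

    Assumption: "three dates" is read as three distinct calendar dates.
    """
    enough_dates = len(set(dates)) >= MIN_DCS_DATES
    return enough_dates and has_supporting_evidence and has_staff_initials
```

For instance, a sheet with entries on `date(2014, 10, 1)`, `date(2014, 10, 8)`, and `date(2014, 10, 15)` passes only if supporting evidence and staff initials are also present; the dates here are made up for illustration.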


The Scoring Procedures are separated into two major sections: preparing to score, and 
reviewing and scoring a datafolio. Each step asks the Scorer a question or directs the Scorer to confirm 
a certain requirement. The steps are presented in a yes/no format to assist the Scorer in moving from 
one step to another. If a Scorer encounters a “no” or an issue outside of the directions provided in the 
Scoring Procedures, he or she must consult with the Table Leader. The Table Leader refers to the 
Decision Rules for Scoring 2014—15 NYSAA Datafolios, if the information on how to proceed in scoring 
the datafolio is not already provided in the Scoring Procedures. 

The Scoring Decision Rules have their own segment in the training DVD. There is also a brief 
overview of the Decision Rules within the Scoring Procedures segment of the training DVD. The 
Decision Rules serve as guidance for Table Leaders when a Scorer encounters an issue that is outside 
of the direction provided in the Scoring Procedures document. The rules are organized by topic, 
beginning with rules that apply to the datafolio as a whole (e.g., incorrect forms, missing Student Page, 
evidence of photocopies, correction fluid/tape or black out). The other topic headings are “Assessment 
Task,” “Verifying Evidence,” and “Dates.” Fifteen Decision Rules were developed that are based on
actual datafolio issues found during a Benchmarking review of datafolios in progress. In the training
DVD, Scoring Decision Rules are presented by number as found in the Decision Rules chart. If
possible, an example is provided that highlights the point of the Decision Rule, and a description is
provided regarding how the rules are to be consistently applied statewide at each Scoring Institute.


8.3.2 Reliability Monitoring Review 


The purpose of the Reliability Monitoring Review (RMR) is to ensure scoring consistency and 
reliability across Scoring Institutes. 

At the end of the Scoring Institute, 20% of the scored datafolios from each scoring site are 
randomly collected by the SSC for the RMR. Measured Progress conducts a Scoring Institute in which 
the random datafolios are scored by highly experienced and qualified Scorers. RMR Scorers complete 
the same NYSAA training and qualification process that is used statewide. 
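
The 20% random selection described above could be drawn as in the Python sketch below. This is an illustration only: the function name `select_rmr_sample` is hypothetical, and rounding a fractional sample size up is an assumption, since the report does not say how partial counts are handled.

```python
import random


def select_rmr_sample(datafolio_ids, percent=20, seed=None):
    """Randomly pick about `percent` of the scored datafolios from one scoring site.

    Assumption: a fractional sample size is rounded up.
    """
    ids = list(datafolio_ids)
    # Integer arithmetic for the ceiling avoids floating-point rounding surprises.
    k = (len(ids) * percent + 99) // 100
    rng = random.Random(seed)
    return rng.sample(ids, k)  # sampling without replacement
```

Passing a `seed` makes the draw reproducible, which can help when a selection needs to be audited later.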

RMR scores are compared with the original scores from the regional Scoring Institutes. The
original score remains the score of record; the RMR score does not change or affect the original score
in any way. The 2014-15 RMR results are presented in Chapter 10.


8.4 SCORER QUALIFICATION AND TRAINING 


A standardized statewide process for Scorer Training and qualification is observed. Each Board 
of Cooperative Educational Services (BOCES) and Big Five City School District conducts at least one 
two-day Scoring Institute during the scoring period. For 2014—15, the scoring period was March 16, 
2015 to May 1, 2015. The same training and scoring process, Scoring Procedures, and Decision Rules 
were applied and implemented statewide. 

The DVD presentation portion of the training includes a welcome and an introduction, which 
briefly outline the DVD segments and documents used during training. The DVD then outlines the 
scoring tools, the step-by-step process for reviewing the datafolios and documenting student scores, 
and the practice scoring that is done while following along with the DVD segment. The first practice is 
completed according to directions outlined in the DVD segment. The first Extension is completed as 
part of the DVD segment. The DVD segment is then paused and participants complete the second 
Extension. The second Extension is completed as a group or in pairs. The DVD segment provides 
details about how the second Extension should have been scored. 

After the first two DVD segments, Scorers practice scoring two additional datafolio samples with 
two Extensions or AGLIs—first as a group or in pairs, and then individually. Each practice is reviewed 
to ensure that Scorers are following the Scoring Procedures accurately. The final DVD segment details 
other best practice information for scoring, and reinforces information about confirming connection of 
the verifying evidence to the Assessment Task and Data Collection Sheets. It also provides details
about the subsequent steps in Scorer Training.

After the DVD, Scorers are given an opportunity for final questions. Training ends with Scorers
completing three calibrated qualifiers with two Extensions or AGLIs each. The qualifiers are actual
student datafolios in a content area. The qualifiers were identified by committees of special education
and general education teachers during a Benchmarking process. Each Scorer must
earn a score of 80% or higher to become qualified. Scorers who do not qualify on the first qualifier set
receive additional training and must complete an additional qualification sample. After the initial set,
Scorers have three opportunities to receive retraining and to qualify. If a Scorer does not qualify after
three additional attempts, he or she is reassigned to another role in the Scoring Institute.
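
The qualification rule above (a score of 80% or higher, with up to three additional attempts after the initial set) can be expressed as a short Python sketch. The function name `qualification_outcome` and the outcome labels are hypothetical; they only restate the rule in the text.

```python
PASSING_SCORE = 80           # percent, per the text above
MAX_ADDITIONAL_ATTEMPTS = 3  # attempts allowed after the initial qualifier set


def qualification_outcome(scores):
    """Return (status, attempts_used) for a Scorer.

    scores -- qualifier scores in order of attempt, initial set first
    """
    allowed = 1 + MAX_ADDITIONAL_ATTEMPTS
    for attempt, score in enumerate(scores[:allowed], start=1):
        if score >= PASSING_SCORE:
            return ("qualified", attempt)
    if len(scores) >= allowed:
        # All attempts used without reaching the passing score.
        return ("reassigned", allowed)
    return ("retraining", len(scores))
```

A Scorer with scores of 70, 79, and 80 qualifies on the third attempt, while four scores below 80 lead to reassignment.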


8.5 SCORING QUALITY CONTROL 


The Quality Control Process at each Scoring Institute is handled by the SSC, AATN Specialists, 
and Table Leaders. The SSC is responsible for planning, conducting, and coordinating NYSAA scoring 
activities for the regional Scoring Institute. Each BOCES or Big Five City School District designates at 
least one individual to assume the role of SSC. 

SSC responsibilities include:

• ensuring that the Scoring Procedures, Decision Rules, and other scoring-related
  guidelines are implemented consistently per the Department’s prescribed model;
• ensuring the lock-and-key security of all datafolios during storage and throughout all
  scoring sessions (datafolio security must be maintained throughout this process);
• gathering the NYSAA student registration information from the RIC or Big Five City Scan
  Center to assist in planning the Scoring Institute;
• planning, coordinating, and conducting the Scoring Institute for each BOCES or Big Five
  City School District;
• being present at all times while scoring is in session;
• coordinating the selection of sample datafolios as requested by the Department for
  RMR;
• ensuring that scoring documentation is completed and provided to the RIC or Big Five
  City Scan Center;
• collecting feedback regarding the Scoring Institute from AATN Specialists, Table
  Leaders, and Scorers;
• providing feedback to the Department about the scoring process, procedures, and
  documentation; and
• returning datafolios following scoring.


AATN Specialists are designated by each BOCES or Big Five City School District to conduct
information sessions and NYSAA training and to assist with scoring.

For NYSAA scoring, AATN Specialists:

• assist SSCs in the planning of the Scoring Institute, as needed;
• conduct training sessions and facilitate qualification sessions for Table Leaders and
  Scorers;
• act as Floor Managers during the scoring process;
• resolve Table Leader questions, using scoring guidelines and resources;
• participate in the Read Behind Process; and
• provide feedback to SSCs and the Department about the scoring processes,
  procedures, and documentation.


Table Leaders are integral to making sure that the processes and procedures outlined by the 
Department in the Scoring Training are followed at each scoring station during each Scoring Institute. 
There is one Table Leader for every five Scorers. 

For NYSAA Scoring, Table Leaders must:

• be experienced Scorers familiar with the 2014-15 NYSAA;
• complete Scoring Training, including the qualification process, prior to the start of the
  Scoring Institute;
• coordinate the datafolio flow at their assigned scoring stations;
• resolve questions from Scorers, using scoring guidelines and resources;
• review and confirm all adjustments and all No Scores documented by Scorers;
• conduct quality control checks of scored datafolios;
• manage the Read Behind Process;
• separate copies of the Scorer Worksheet, as designated by the SSC;
• return scored datafolios to the appropriate boxes;
• provide feedback to the SSC and the Department about the scoring process,
  procedures, and documentation; and
• assist as needed in evaluating and providing additional training for Scorers who do not
  qualify during the first round of qualifying.


The Table Leaders are responsible for three main quality control checks. Their first responsibility is to
resolve Scorer questions and to confirm NS ratings. When a Scorer questions the Connection to
Grade-Level Content or has a question about scoring a datafolio that may result in an NS, the datafolio
must be reviewed with the Table Leader. If the issue cannot be readily resolved by the Table Leader by
using the Scoring Procedures and Scoring Decision Rules, it must be brought by the Table Leader to
the Floor Manager. If the issue cannot be readily resolved by the Floor Manager, the SSC will make the
final decision.
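
The escalation path just described (Table Leader, then Floor Manager, then SSC) can be sketched as a simple ordered lookup. The names `ESCALATION_ORDER` and `who_decides` are hypothetical, and the boolean flags stand in for the human judgment of whether a role can readily resolve the issue.

```python
ESCALATION_ORDER = ("Table Leader", "Floor Manager", "SSC")


def who_decides(can_resolve):
    """Return the role that settles a scoring question.

    can_resolve -- dict mapping a role name to True/False, standing in for
                   whether that role could readily resolve the issue
    """
    for role in ESCALATION_ORDER[:-1]:
        if can_resolve.get(role, False):
            return role
    # The SSC makes the final decision when the earlier roles cannot resolve it.
    return ESCALATION_ORDER[-1]
```

For example, `who_decides({"Table Leader": False, "Floor Manager": True})` returns `"Floor Manager"`.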

The second responsibility of a Table Leader is to complete a standardized quality control check.
A quality control check is conducted by the Table Leader once a datafolio has been scored and 
returned by a Scorer. The Scorer Worksheet is cross-checked against the Scannable. Any corrections 
made to the ratings by the Scorer are double-checked, and comments are confirmed as being 
appropriate. A blue dot is affixed by the Table Leader to confirm that the quality control check was 
conducted. 

The third responsibility of a Table Leader is to manage the Read Behind Process. The Read 
Behind Process occurs throughout the Scoring Institute. This process ensures the integrity of scoring 
across scoring stations. Table Leaders select the first, third, and then every seventh datafolio from each 
Scorer for a read behind. The Scannable is pulled and held by the Table Leader, and a red dot is 
placed on the datafolio. This indicates that it has been selected for a read behind. The first Scorer 
scores the datafolio, completes the Scorer Worksheet, and returns the datafolio to the Table Leader. 
The Table Leader turns the Scorer Worksheet over, places it into the front pocket of the datafolio, and 
then routes the scored datafolio to be scored at a different scoring station or a read-behind table for the 
second read. The second Scorer scores the datafolio, completes a second Scorer Worksheet, and 
returns the datafolio to the Table Leader. The Table Leader (either at the first scoring station or read- 
behind table) compares the two worksheets. If no discrepancy exists, the Table Leader fills in his or her 
Scorer ID# and completes the Scannable. A quality control check is completed, and a blue dot is affixed 
to the datafolio. The second Scorer Worksheet is destroyed. If a discrepancy between the scores is
found, the Table Leader highlights the discrepant areas and forwards the datafolio to the Floor Manager 
or SSC for resolution. The Floor Manager or SSC reviews the discrepant areas, enters his or her 
Scorer ID#, and completes the Scannable. The Floor Manager returns the datafolio to the Table Leader 
for quality control. After a datafolio has been through the Read Behind Process, the Table Leader 
completes a quality control check. The Table Leader then works with the Scorer to review the 
discrepancy and provides any training or support that is needed. If the Scorer continues to have 
discrepant scores, the Table Leader is then directed to consult the Floor Manager and/or the SSC to 
discuss additional training or reassignment. 
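
The read-behind selection rule (the first, the third, and then every seventh datafolio from each Scorer) can be sketched as below. The function name `selected_for_read_behind` is hypothetical. The sketch assumes that "every seventh" means positions 7, 14, 21, and so on; the report does not say whether the count instead restarts after the third datafolio.

```python
def selected_for_read_behind(position):
    """Return True if the datafolio at this 1-based position for one Scorer is selected.

    Assumption: "every seventh" is read as positions that are multiples of seven.
    """
    return position in (1, 3) or position % 7 == 0
```

Under this reading, a Scorer's first 21 datafolios would yield read behinds at positions 1, 3, 7, 14, and 21.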

As an additional quality control check to confirm that the Scoring Institutes are following all of 
the processes and guidelines prescribed by the Department, score site observation visits to a sample of 
Scoring Institutes are conducted. Each year, the Department designates a set of sites to be monitored 
during their Scoring Institutes. The observation visits are conducted by the Regional Lead Trainers 
(RLTs). SSCs are notified if they are selected by the Department for observation. Observers cannot 
participate or assist in any part of the Scoring Institute. They cannot interact or provide technical 
assistance during the observation. An Observation Protocol Checklist is completed during the visit and 
is then submitted to the Department. 


CHAPTER 9 CLASSICAL ITEM ANALYSIS 


As noted in Brown (1983), “A test is only as good as the items it contains.” A complete 
evaluation of a test’s quality must include an evaluation of each item. Both the Standards for 
Educational and Psychological Testing (AERA et al., 2014) and the Code of Fair Testing Practices in 
Education (Joint Committee on Testing Practices, 2004) include standards for identifying quality items. 
While the specific statistical criteria identified in these publications were developed primarily for 
general—not alternate—assessments, the principles and some of the techniques apply within the 
alternate assessment framework, as well. 

Both qualitative and quantitative analyses were conducted to ensure that New York State 
Alternate Assessment (NYSAA) items met these standards. Qualitative analyses are described in 
earlier sections of this report; this section focuses on the quantitative evaluations. The statistical 
evaluations discussed are difficulty indices, discrimination (item-test correlations), item means, 
structural relationships (correlations between the dimensions), and bias and fairness. The item 
analyses presented here are based on the statewide administration of the 2014-15 NYSAA. 


9.1 DIFFICULTY AND DISCRIMINATION 


For the NYSAA, each student datafolio for a specified content area at a given grade level 
receives an Accuracy score on each of five standards in English Language Arts (ELA) and 
mathematics, and on each of two standards in science and social studies. For each standard, teachers 
choose a task by which to assess their students. The chosen task may be at one of three possible 
Levels of Complexity (LOC: LOC1, LOC2, or LOC3), where the higher levels indicate greater 
complexity. For a given student, the LOC at which the student is assessed may differ from one 
standard to another. Thus, for any one standard, the number of students assessed varies across the 
LOCs, and the distribution of students across the LOCs differs from one standard to another. Tables 
H-1 to H-18 in Appendix H, Classical Item Analysis, include the student counts for the three LOCs for 
each assessed standard. Table 9-1 summarizes the means and ranges of these counts. As can be 
seen in these tables, approximately 3,000 students were assessed at each grade level; on average, 
about 45% of the students were assessed at LOC1, about 42% at LOC2, and about 14% at LOC3. 
Note, however, that there also existed substantial variability across standards in how students were 
distributed across the LOCs. For example, for Grade 3 ELA, for Standard 314 about 75% of the 
students were assessed at LOC2, whereas for Standard 322 about 72% were assessed at LOC1. In 
general, for every grade level and content area, LOC1 and LOC2 combined to easily have the most 
assessed students. The largest count of students assessed at LOC3 occurred for Grade 7 ELA 
Standard 753, where the 1,278 students were only 38% of the total. 


Table 9-1. 2014-15 NYSAA: Summary of Numbers of Students Assessed Across the Levels of Complexity 

                                               LOC 1                      LOC 2                      LOC 3
Content Area            Grade            Mean      Min      Max     Mean      Min      Max     Mean      Min      Max
English Language Arts   3             1170.60      374    1,942  1178.20      601    2,022   366.60      158      505
                        4             1260.60      500    2,002  1395.60      573    2,108   272.80      112      437
                        5             1248.20      790    1,854  1662.80    1,014    2,200   169.80       77      246
                        6             1234.20      702    1,609  1680.80    1,369    2,381   249.60       76      556
                        7             1932.00    1,231    2,387   850.80      289    1,869   523.80      157    1,278
                        8             2000.20    1,694    2,237   980.40      626    1,249   351.00      242      459
                        High School   1240.20      749    2,013  1186.20      428    1,587   420.80      324      519
Mathematics             3             1115.00      590    1,756  1257.20      542    1,821   346.40       92      587
                        4             1528.00      895    2,055   895.60      316    1,383   506.40       84    1,031
                        5             1687.40      748    2,449  1042.20      440    2,088   348.60      155      777
                        6             1505.20    1,227    1,845  1333.20      646    1,782   322.60      108      680
                        7             1548.80      537    2,557  1153.00      317    2,353   609.80      181    1,165
                        8             2168.20    1,866    2,821   810.00      376    1,173   359.40      163      763
                        High School   1142.40      643    1,494  1009.60      478    1,889   677.60      276      860
Science                 4             1347.00    1,289    1,405  1257.00    1,205    1,309   308.00      292      324
                        8             1079.00      743    1,415  1755.00    1,320    2,190   498.00      404      592
                        High School    856.00      654    1,058  1343.00      777    1,909   657.00      294    1,020
Social Studies          High School    632.50      528      737  1793.00    1,662    1,924   422.50      388      457


For each task, the teacher assessed each student a certain number of times (determined by the 
teacher), and the teacher recorded the number of times the student was successful. Because the 
number of times a student was assessed for each task was something teachers could decide for 
themselves, this number varied across students. Therefore, a percent-correct score was recorded 
rather than a number-correct score. Hence, for a standard assessed at a particular LOC, the observed 
student scores ranged from 0 to 100. 

To develop a single scale for comparing all students for a specific standard, percent-correct task 
scores associated with higher LOCs deserve more credit than the same scores at lower LOCs. Based 
on a scientific study jointly carried out by Measured Progress and the Department, a single scale was 
developed that allows scores at different LOCs for a given standard to be combined into a single scaled 
score for that standard. Specifically, the scaled score for a given standard is calculated by taking the 
observed percent-correct score, adding a credit when the LOC is a 2 or 3, and then dividing the result 
by 10. The complexity credit was 75 for an LOC of 2, and 150 for an LOC of 3. The division by 10 was 
needed to eliminate unwanted gaps in the scale, resulting from some combinations of scores being 
much less likely than others. Thus, the scaled scores for a given standard could range from 0 to 25. 


The formulas for the individual standard scaled scores by complexity are as follows: 

$$SS_{\text{LOC1}} = RS_{\text{LOC1}}/10$$
$$SS_{\text{LOC2}} = (RS_{\text{LOC2}} + 75)/10$$
$$SS_{\text{LOC3}} = (RS_{\text{LOC3}} + 150)/10$$


To compare students at the level of total test score, a scaled score that is simply the sum of the 
scaled scores on the standards plus an additive constant is produced. Specifically, for ELA and 
mathematics, the total score is the sum of the scaled scores on the five standards plus 400, resulting in 
total scores that can range from 400 to 525. And for science and social studies, the total score is the 
sum of the scaled scores on the two standards plus 550, resulting in total scores that can range from 
550 to 600. Thus, the formulas for the total subject standard scaled scores are as follows: 

$$SS_{\text{Total ELA/Math}} = SS_{\text{Std1}} + SS_{\text{Std2}} + SS_{\text{Std3}} + SS_{\text{Std4}} + SS_{\text{Std5}} + 400$$
$$SS_{\text{Total Sci/SS}} = SS_{\text{Std1}} + SS_{\text{Std2}} + 550$$
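
The scoring rules above can be illustrated with a short sketch. The function names and the student data are hypothetical and are not taken from the operational scoring system; only the credits (75 and 150), the division by 10, and the additive constants (400 and 550) come from the text.

```python
# Complexity credits stated in the text: none for LOC1, 75 for LOC2, 150 for LOC3.
LOC_CREDIT = {1: 0, 2: 75, 3: 150}


def standard_scaled_score(percent_correct, loc):
    """Scaled score for one standard: (percent correct + LOC credit) / 10."""
    if not 0 <= percent_correct <= 100:
        raise ValueError("percent_correct must be between 0 and 100")
    return (percent_correct + LOC_CREDIT[loc]) / 10


def total_scaled_score(standard_scores, constant):
    """Total scaled score: sum of the standard scaled scores plus a constant
    (400 for ELA and mathematics, 550 for science and social studies)."""
    return sum(standard_scores) + constant


# Hypothetical ELA student: (percent correct, LOC) for each of five standards.
ela_student = [(80, 1), (100, 2), (60, 2), (90, 3), (100, 1)]
ela_scores = [standard_scaled_score(p, loc) for p, loc in ela_student]
ela_total = total_scaled_score(ela_scores, 400)
print(ela_scores, ela_total)
```

A perfect score at LOC3 gives (100 + 150) / 10 = 25, which is why each standard tops out at 25 and the totals top out at 525 and 600.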

From the above, it is clear that there are two types of scores on this test that could be treated as 
traditional “item” scores for the purposes of psychometric evaluation. The standards are one 
reasonable choice to represent the traditional items on the test because each student is assessed on 
the same number of standards, and the sum of the scaled scores that he or she receives on the 
standards provides the basis for the total scaled score for a student. Alternatively, the application of a 
standard at a given LOC (S/LOC) could also be used to represent a traditional item because the raw 
percent-correct scores are directly comparable across students who were assessed on the same 
standard at the same LOC. 

Using both of these item representations (“Standard” and “S/LOC”), all items were evaluated in 
terms of item difficulty according to standard classical test theory practices. “Difficulty” was defined as 
the average proportion of points achieved on an item, and was measured by obtaining the average 
score on an item and dividing by the maximum score for the item. By computing the difficulty index as 
the average proportion of points achieved, the items are placed on a scale that ranges from 0.0 to 1.0. 
Although the p-value is traditionally described as a measure of difficulty (as it is described here), it is 
properly interpreted as an easiness index, because larger values indicate easier items. 

An index of 0.0 indicates that all students received no credit for the item, and an index of 1.0 
indicates that all students received full credit for the item. Items that have either a very high or a very 
low difficulty index are considered to be potentially problematic because they are either so difficult that 
few students get full credit or so easy that nearly all students get full credit. In either case, such items 
should be reviewed for appropriateness for inclusion on the assessment. 
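
As a minimal sketch of the difficulty index defined above (mean item score divided by the maximum item score), the following uses hypothetical score lists; the function name is an illustration, not part of any NYSAA software. The maximum score is 100 for an S/LOC item (percent correct) and 25 for a standards-based item (scaled score).

```python
def difficulty_index(scores, max_score):
    """Average proportion of points achieved: mean score / maximum score."""
    if not scores:
        raise ValueError("at least one score is required")
    return (sum(scores) / len(scores)) / max_score


# Hypothetical S/LOC item: percent-correct scores, maximum 100.
sloc_scores = [100, 80, 90, 70, 100]
print(round(difficulty_index(sloc_scores, 100), 2))

# Hypothetical standards-based item: scaled scores, maximum 25.
standard_scores = [8.0, 17.5, 13.5, 24.0, 10.0]
print(round(difficulty_index(standard_scores, 25), 2))
```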

It is worth mentioning that using a norm-referenced criterion such as p-values to evaluate test 
items is somewhat contradictory to the purpose of a criterion-referenced assessment like the NYSAA. 
Criterion-referenced assessments are intended primarily to provide evidence of student progress 
relative to a standard, rather than to differentiate between students. Thus, the generally accepted 
criteria regarding classical item statistics are only cautiously applicable to the NYSAA. 

A desirable feature of an item is that higher-ability students perform better on the item than do 
lower-ability students. The correlation between student performance on a single item and total test 
score is a commonly used measure of this characteristic of an item. Within classical test theory, this 
item-test correlation is referred to as the item’s “discrimination” because it indicates the extent to which 
successful performance on an item discriminates between high and low scores on the test. The 
discrimination index used to evaluate NYSAA items was the Pearson product-moment correlation. The 
theoretical range of this statistic is -1.0 to 1.0. 

Discrimination indices can be thought of as measures of how closely an item assesses the 
same knowledge and skills assessed by other items contributing to the criterion total score. That is, the 
discrimination index can be thought of as a measure of construct consistency. In light of this 
interpretation, the selection of an appropriate criterion total score is crucial to the interpretation of the 
discrimination index. For the NYSAA, the total test scaled score, excluding the item being evaluated, 
was used as the criterion score. 
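
The discrimination index described above can be sketched as a Pearson correlation between an item score and the total of the remaining items. The data and function names below are hypothetical. The additive constant (400 or 550) is omitted because adding a constant to the criterion does not change a correlation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_rest_correlation(score_rows, item):
    """Correlate one item's scores with the total of all other items."""
    item_scores = [row[item] for row in score_rows]
    rest_totals = [sum(row) - row[item] for row in score_rows]
    return pearson(item_scores, rest_totals)


# Hypothetical scaled scores: one row per student, one column per standard.
rows = [
    [8.0, 9.5, 7.0, 10.0, 6.5],
    [17.5, 16.0, 15.5, 17.0, 14.0],
    [13.5, 12.0, 16.5, 11.0, 15.0],
    [24.0, 22.5, 20.0, 23.0, 21.5],
    [10.0, 14.0, 9.0, 12.5, 8.0],
]
discriminations = [item_rest_correlation(rows, i) for i in range(5)]
print([round(d, 2) for d in discriminations])
```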

A summary of the item difficulty and item discrimination statistics for each grade/content area 
combination is presented in Tables 9-2 and 9-3 for the two kinds of items discussed above, “S/LOC” 
and “Standard,” respectively. As shown in Table 9-2, the mean difficulty values for S/LOC items ranged 
from 0.83 to 0.89, indicating that, overall, students performed well on the S/LOC items on the NYSAA, 
and that students, on average, were well prepared for the LOCs that were targeted by their instruction 
for each standard. On the other hand, as shown in Table 9-3, the mean difficulty values for the 
standards-based scaled item scores ranged from 0.47 to 0.62, with standard deviations (across the five 
standards) on the order of about 0.05. These results indicate that the difficulty levels of the five 
standards within a given assessment are similar to each other (small standard deviation) and are well 
aligned with the proficiency distributions of the students (means close to 0.50); neither overly difficult 
nor overly easy. In contrast to alternate assessments, the difficulty values for assessments designed for 
the general population tend to be in the 0.4 to 0.7 range for the majority of items. Because the nature of 
alternate assessments is different from that of general assessments, and because very few guidelines 
exist as to criteria for interpreting these values for alternate assessments, the values presented in 
Tables 9-2 and 9-3 should not be interpreted to mean that the students who took the NYSAA performed 
either better or worse than the students who took general assessments. 

Also shown in Tables 9-2 and 9-3 are the mean discrimination values for the S/LOC and 
standards-based items, respectively, as calculated by the correlation between the item scores and the 
scaled total scores. Because the majority of students received high scores on the S/LOC items and 
these raw scores are not adjusted for LOC, as is done for the scaled total scores, the discrimination 
indices are somewhat lower than one might otherwise expect, with mean values ranging from 0.29 to 
0.52. In particular, if all of the students received high percent-correct scores on the S/LOC items, there 
is little variability in the item score for differentiating the criterion scores. On the other hand, for the 
standards-based items, the mean discrimination indices ranged from 0.61 to 0.76 for mathematics and 
ELA and 0.52 to 0.64 for science and social studies. These results indicate a strong positive 
relationship between the scaled scores on the standards-based items and the scaled total scores. As 
with the item difficulty values, because the nature and use of the NYSAA are different from those of a 
general assessment, and because very few guidelines exist as to criteria for interpreting these values 
for alternate assessments, the statistics presented in Tables 9-2 and 9-3 should be interpreted with 
caution. 


Table 9-2. 2014-15 NYSAA: Summary of Item Difficulty and Discrimination Statistics 
by Content Area and Grade for S/LOC Items 


                                    Number of   Difficulty (p-Value)    Discrimination
Content Area            Grade         Items       Mean       SD         Mean       SD
English Language Arts   3               15        0.86      0.07        0.47      0.16
                        4               15        0.87      0.06        0.39      0.17
                        5               15        0.87      0.05        0.42      0.14
                        6               15        0.89      0.06        0.42      0.14
                        7               15        0.88      0.04        0.38      0.13
                        8               15        0.88      0.05        0.42      0.11
                        High School     15        0.85      0.06        0.46      0.13
Mathematics             3               15        0.84      0.07        0.52      0.15
                        4               15        0.85      0.06        0.44      0.13
                        5               15        0.85      0.05        0.46      0.16
                        6               15        0.87      0.06        0.48      0.13
                        7               15        0.85      0.07        0.47      0.10
                        8               15        0.87      0.06        0.45      0.13
                        High School     15        0.83      0.07        0.46      0.10
Science                 4                6        0.89      0.06        0.29      0.14
                        8                6        0.87      0.06        0.35      0.11
                        High School      6        0.86      0.06        0.37      0.10
Social Studies          High School      6        0.84      0.11        0.39      0.09


Table 9-3. 2014-15 NYSAA: Summary of Item Difficulty and Discrimination Statistics by Content Area and 
Grade for Standards-Based Items 


                                                          Difficulty                    Discrimination
                                    Number of    Scaled Score        p-Value           (Corr. w/Total)
Content Area            Grade         Items       Mean      Std      Mean      Std      Mean      Std
English Language Arts   3                5       13.71     2.17      0.55     0.09      0.70     0.04
                        4                5       13.51     1.67      0.54     0.07      0.63     0.04
                        5                5       13.39     1.10      0.54     0.04      0.61     0.02
                        6                5       13.89     0.71      0.56     0.03      0.68     0.02
                        7                5       12.92     1.45      0.52     0.06      0.65     0.02
                        8                5       12.33     0.48      0.49     0.02      0.71     0.03
                        High School      5       13.76     1.34      0.55     0.05      0.75     0.03
Mathematics             3                5       13.69     1.67      0.55     0.07      0.72     0.03
                        4                5       13.16     1.68      0.53     0.07      0.70     0.04
                        5                5       12.63     2.03      0.51     0.08      0.68     0.05
                        6                5       13.15     0.65      0.53     0.03      0.76     0.04
                        7                5       13.64     2.06      0.55     0.08      0.64     0.03
                        8                5       11.79     1.38      0.47     0.06      0.72     0.03
                        High School      5       14.46     0.31      0.58     0.01      0.68     0.03
Science                 4                2       13.56     0.48      0.54     0.02      0.55     0.01
                        8                2       14.77     0.72      0.59     0.03      0.52     0.01
                        High School      2       15.58     0.59      0.62     0.02      0.62     0.00
Social Studies          High School      2       15.39     0.13      0.62     0.01      0.64     0.02


9.2 STRUCTURAL RELATIONSHIP 


By design, the performance level classification of the NYSAA is based on scaled scores 
associated with five standards for ELA and mathematics and on two standards for science and social 
studies. These different standards can be conceptualized as different construct dimensions of the 
assessment. As with any assessment, it is important that the dimensions composing the assessment 
be carefully examined. This was achieved by exploring the relationships between student scaled scores 
on the different dimensions with Pearson correlation coefficients. A very low correlation (near zero) 
would indicate that the dimensions are not related; a low negative correlation (approaching -1.00) would 
indicate that they are inversely related (i.e., that a student with a high score on one dimension had a 
low score on the other); and a high positive correlation (approaching 1.00) would indicate that the 
information provided by one dimension is similar to that provided by the other dimension. In addition, 
the correlation matrices for the standards were analyzed with factor analysis to determine the number 
of dimensions that are statistically significant for each assessment instrument analyzed. Because these 
assessments are unidimensionally scored, it is important to determine the degree to which 
unidimensionality accounts for the variability in the scores. 

The average correlations between the scaled scores on the standards by content area and 
grade are shown in Table 9-4. The detailed results for each pair of standards are given in Appendix I. 


Table 9-4. 2014-15 NYSAA: Average Correlations 


                                          Average Correlation
Content Area            Grade          Average    Standard Deviation
English Language Arts   3                0.58           0.06
                        4                0.51           0.05
                        5                0.49           0.04
                        6                0.57           0.02
                        7                0.53           0.03
                        8                0.61           0.03
                        High School      0.66           0.04
Mathematics             3                0.61           0.04
                        4                0.59           0.05
                        5                0.57           0.06
                        6                0.67           0.04
                        7                0.52           0.04
                        8                0.62           0.03
                        High School      0.57           0.05
Science                 4                0.59
                        8                0.55
                        High School      0.65
Social Studies          High School      0.67


The inter-item correlations ranged from a low of 0.44 (e.g., Grade 4, Standards 413 and 432) to a 
high of 0.73 (e.g., Grade 6 mathematics, Standards 606 and 608). The averages for a given grade and 
content area ranged from 0.49 to 0.67. These correlations indicate that the scores on the different 
standards within a grade and content area have strong positive relationships with each other. Next, a 
factor analysis was conducted on the correlation matrix for each grade and content area, to determine 
the degree to which a unidimensional scale can account for the variance in the scaled total scores. For 
science and social studies, because they have only two scored dimensions, the strong positive 
correlations are alone strong evidence in support of their unidimensional scales. As shown in Table 9-5, 
the factor analysis confirmed this by indicating that 77% to 84% of the variance in the correlations of the 
standards-based scaled scores is accounted for by a unidimensional scale. For ELA and mathematics, 
the factor analysis results indicate that a single scored dimension accounts for 59% to 74% of the 
variance in the correlations of the standards-based scaled scores. 


Table 9-5. 2014-15 NYSAA: Factor Analysis Table 


                                         Percent of Variance Accounted for by Each Factor
Grade         Content Area                  1        2        3        4        5
3             English Language Arts      66.5%    11.3%     8.5%     7.6%     6.1%
              Mathematics                68.9%     9.8%     8.2%     7.0%     6.2%
4             English Language Arts      60.5%    12.1%    10.9%     8.8%     7.7%
              Mathematics                67.3%     9.9%     9.1%     7.1%     6.6%
              Science                    79.5%    20.5%
5             English Language Arts      59.1%    12.4%    10.4%     9.9%     8.3%
              Mathematics                65.6%    11.4%     8.8%     7.7%     6.5%
6             English Language Arts      65.5%     9.7%     8.8%     8.3%     7.6%
              Mathematics                74.0%     8.1%     6.7%     5.8%     5.4%
7             English Language Arts      62.4%    11.2%     9.6%     8.5%     8.3%
              Mathematics                62.1%    11.0%    10.2%     8.5%     8.2%
8             English Language Arts      69.3%     9.1%     7.6%     7.6%     6.3%
              Mathematics                69.4%     9.2%     7.9%     7.0%     6.5%
              Science                    77.3%    22.7%
High School   English Language Arts      72.7%     8.6%     7.2%     6.2%     5.2%
              Mathematics                65.8%    10.3%     9.8%     7.2%     6.9%
              Science                    82.5%    17.5%
              Social Studies             83.6%    16.4%


Thus, these results also give strong support for the appropriateness of the use of a 
unidimensional scale for these assessment instruments. 
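
The report does not state how the factors were extracted. One reading that appears consistent with Tables 9-4 and 9-5 is that each percentage is an eigenvalue of the correlation matrix divided by the number of standards. Under that assumption, the two-standard case (science and social studies) can be sketched directly, because a 2 x 2 correlation matrix with correlation r has eigenvalues 1 + r and 1 - r. The function name is hypothetical.

```python
def two_standard_variance_shares(r):
    """Percent of variance for each factor of a 2 x 2 correlation matrix
    [[1, r], [r, 1]], assuming factors are eigenvalue shares (an assumption,
    not a method confirmed by the report)."""
    eigenvalues = (1 + r, 1 - r)
    total = sum(eigenvalues)  # equals the number of standards, 2
    return tuple(100 * value / total for value in eigenvalues)


# Grade 4 science: the average correlation in Table 9-4 is 0.59.
first, second = two_standard_variance_shares(0.59)
print(round(first, 1), round(second, 1))
```

With r = 0.59 this gives about 79.5% and 20.5%, in line with the Grade 4 science row of Table 9-5. Small differences for other rows would be expected because the tabled correlations are rounded to two decimals.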


9.3 BIAS/FAIRNESS 


The Code of Fair Testing Practices in Education (Joint Committee on Testing Practices, 2004) 
explicitly states that subgroup differences in performance should be examined when sample sizes 
permit, and actions should be taken to make certain that differences in performance are due to 
construct-relevant, rather than construct-irrelevant, factors. The guidelines in the Code of Fair Testing 
Practices in Education are consistent with the relevant sections of the Standards for Educational and 
Psychological Testing (AERA et al., 2014). 

When appropriate, the standardization differential item functioning (DIF) procedure (Dorans & 
Kulick, 1986) is used to identify items for which subgroups of interest perform differently, beyond the 
effect of differences in overall achievement. However, because the NYSAA uses a datafolio that does 
not include standard items that are taken by all students, it was not possible to conduct DIF analyses. 

Although it is not possible to run quantitative analyses of item bias for the NYSAA, due to data 
limitations, fairness can be addressed through the assessment Blueprints, which are designed to reflect 
the core curriculum, as described in detail earlier in this report. The assessment 
Blueprints, which reflect recommendations laid out in the Standards for Educational and Psychological 
Testing, were designed to ensure that the test is free of any insensitive or offensive material, as well as 
to ensure alignment with general education grade-level content and to promote higher expectations for 
students taking the NYSAA. 

Issues of fairness are also addressed in the NYSAA Administration and Scoring Procedures. 
Chapter 7 of this report describes in detail the procedures for administering the NYSAA and 
constructing the datafolio, as well as the training and review steps designed to ensure that the test is 
administered appropriately and consistently to all students. Chapter 8 describes, in detail, the Scoring 
Rubrics used, selection and training of Scorers, and scoring Quality Control Processes. These 
processes were followed to minimize bias due to differences in how individual Scorers award scores. 


CHAPTER 10 CHARACTERIZING ERRORS ASSOCIATED WITH TEST SCORES 


One of the primary uses of the New York State Alternate Assessment (NYSAA) scores is for 
school-, district-, and State-level accountability in the federal No Child Left Behind (NCLB) Act of 2001 
and in State accountability systems. The students are classified as Proficient or Not Proficient, and are 
included in the State’s Adequate Yearly Progress (AYP) calculation. In this case, the reliability of 
individual student scores, while not meaningless, becomes much less important. The scores have been 
collapsed for each student to a yes/no decision and then aggregated across students. 

For purposes of calculating reliability estimates, standards-based item scores are defined in the 
same way as described in Chapter 9. Specifically, the scaled scores on the five standards for English 
Language Arts (ELA) and mathematics and the two standards for science and social studies are treated 
as the item scores. 


10.1 RELIABILITY 


In the previous chapter, individual item characteristics of the 2014-15 NYSAA were presented. 
Although individual item performance is an important focus for evaluation, a complete evaluation of an 
assessment must address the way in which the items (or, in this case, standards-based items) that 
make up the test score function together and complement one another. Any measurement includes 
some amount of measurement error. No academic assessment can measure student performance with 
perfect accuracy; some students will receive scores that underestimate their true abilities, and other 
students will receive scores that overestimate their true abilities. Items that function well together 
produce assessments that have less measurement error (i.e., the error is small, on average). Such 
assessments are described as reliable. 

There are a number of ways to estimate an assessment’s reliability. One approach is to split all 
test items into two groups and then correlate students’ scores on the two half-tests. This is known as a 
split-half estimate of reliability. If the two half-test scores correlate highly, the items included are likely 
measuring very similar knowledge or skills. It suggests that measurement error will be minimal. 

The split-half method requires psychometricians to select items that contribute to each half-test 
score. This decision may have an effect on the resulting correlation, since each different possible split 
of the test halves will result in a different correlation. Another problem with the split-half method of 
calculating reliability is that it underestimates reliability, because test length is cut in half. All else being 
equal, a shorter test is less reliable than a longer test. Cronbach (1951) provided a statistic, alpha (α), 
that avoids the shortcomings of the split-half method by comparing individual item variances to total test 
variance. Cronbach’s α was used to assess the reliability of the 2014-15 NYSAA tests. The formula is 
as follows: 

$$\alpha = \frac{n}{n-1}\left[1 - \frac{\sum_{i=1}^{n}\sigma^{2}_{(Y_i)}}{\sigma^{2}_{x}}\right]$$

where 
$i$ indexes the item, 
$n$ is the number of items, 
$\sigma^{2}_{(Y_i)}$ represents individual item variance, and 
$\sigma^{2}_{x}$ represents the total test variance. 


If α is high (in practice, toward the high end of the typical Cronbach’s α range of 
0.50 to 0.99), the parts of the test are likely measuring very similar knowledge or skills. Thus, a high 
Cronbach's α coefficient is evidence that the standards-based items complement one another, and 
suggests that the assessment is reliable. Table 10-1 presents scaled total score descriptive statistics 
(maximum possible scaled total score, average scaled total score, and standard deviation), Cronbach's 
α coefficient, and scaled total score standard errors of measurement (SEMs) for each content area 
and grade. The results show that the reliability estimates range from 0.82 to 0.91 for ELA and 
mathematics, and range from 0.70 to 0.80 for science and social studies. The latter values are 
expected to be lower because those tests have fewer items. Considering that the NYSAAs are 
necessarily shorter than general assessments, the reliability coefficients in Table 10-1 give strong 
support to the reliability of the reported scaled total scores and their intended interpretations. 
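
A minimal sketch of the reliability computation follows. It implements Cronbach's alpha from individual item variances and total score variance, as described above. The SEM function assumes the usual classical test theory relation SEM = SD x sqrt(1 - alpha), which this chapter does not state explicitly. Function names and the small data set are hypothetical.

```python
from math import sqrt
from statistics import pvariance


def cronbach_alpha(score_rows):
    """Cronbach's alpha for rows of item scores (one row per student)."""
    n_items = len(score_rows[0])
    item_variances = [
        pvariance([row[i] for row in score_rows]) for i in range(n_items)
    ]
    total_variance = pvariance([sum(row) for row in score_rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(sd, alpha):
    """SEM under the assumed relation SD * sqrt(1 - alpha)."""
    return sd * sqrt(1 - alpha)


# Hypothetical standards-based scaled scores for five students, five standards.
rows = [
    [8.0, 9.5, 7.0, 10.0, 6.5],
    [17.5, 16.0, 15.5, 17.0, 14.0],
    [13.5, 12.0, 16.5, 11.0, 15.0],
    [24.0, 22.5, 20.0, 23.0, 21.5],
    [10.0, 14.0, 9.0, 12.5, 8.0],
]
alpha = cronbach_alpha(rows)
print(round(alpha, 2))

# Grade 8 science values from Table 10-1: SD 10.48, alpha 0.70.
print(round(standard_error_of_measurement(10.48, 0.70), 2))
```

Applying the assumed SEM relation to the rounded values in Table 10-1 reproduces the tabled SEMs only approximately, since the published alpha values are rounded to two decimals.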


Table 10-1. 2014-15 NYSAA: Scaled Total Score Descriptive Statistics, Cronbach’s Alpha, and 
Standard Errors of Measurement (SEM) by Grade and Content Area 


                                                      Scaled Total Score
                                    Number of                          Standard
Content Area            Grade       Students    Maximum      Mean     Deviation   Reliability (α)     SEM
English Language Arts   3             2,745       525       467.50      24.61          0.87           8.79
                        4             2,967       525       466.36      21.61          0.83           8.85
                        5             3,127       525       465.62      20.03          0.82           8.40
                        6             3,212       525       468.10      22.30          0.86           8.23
                        7             3,361       525       463.28      24.72          0.84           9.87
                        8             3,398       525       460.14      24.97          0.89           8.39
                        High School   2,900       525       467.22      26.20          0.91           8.05
Mathematics             3             2,746       525       467.41      24.70          0.88           8.45
                        4             2,969       525       464.60      26.27          0.87           9.52
                        5             3,129       525       461.82      23.45          0.86           8.67
                        6             3,213       525       464.37      25.50          0.90           7.86
                        7             3,364       525       466.82      25.47          0.84          10.19
                        8             3,392       525       457.68      25.00          0.89           8.46
                        High School   2,903       525       470.14      28.28          0.87          10.28
Science                 4             2,953       600       576.50      10.54          0.74           5.38
                        8             3,382       600       578.83      10.48          0.70           5.75
                        High School   2,896       600       580.43      11.59          0.77           5.59
Social Studies          High School   2,886       600       580.17      10.99          0.80           4.89


10.2 SUBGROUP RELIABILITY 


The reliability coefficients discussed in the previous section were based on the overall 
population of students who took the 2014-15 NYSAA. Subgroup Cronbach's α coefficients were calculated using 
the formula defined above, using only the members of the subgroup in question in the computations. 
These statistics are reported in Appendix D. Note that statistics are reported only for subgroups with at 
least 11 students. 

For several reasons, the results of this section should be interpreted with caution. First, inherent 
differences between grades and content areas preclude making valid inferences about the quality of a 
test that are based on statistical comparisons with other tests. Second, reliabilities are dependent not 
only on the measurement properties of a test but on the statistical distribution of the studied subgroup. 
For example, in Appendix D, it can be readily seen that subgroup sample sizes may vary considerably, 
which results in natural variation in reliability coefficients. Alternatively, α, which is a type of correlation 
coefficient, may be artificially depressed for subgroups with little variability (Draper & Smith, 1998). 
Additionally, there is no industry standard to interpret the strength of a reliability coefficient, and this is 
particularly true when the population of interest is a single subgroup. 


10.3 DECISION ACCURACY AND CONSISTENCY 


While related to reliability, the accuracy and consistency of classifying students into 
performance categories is an even more important issue in a standards-based reporting framework 
(Livingston & Lewis, 1995). Unlike generalizability coefficients, decision accuracy and consistency 
(DAC) can usually be computed with the data currently available for most alternate assessments. 
Based on the raw scale cut scores established for each content area via standard setting in June 2008, 
each student was classified into one of the following performance levels: Not Meeting Learning 
Standards, Partially Meeting Learning Standards, Meeting Learning Standards, and Meeting Learning 
Standards with Distinction. (Lookup tables for converting raw scores to performance levels are 
presented in Chapter 11.) This section of the report explains the methodologies used to assess the 
reliability of classification decisions and presents the results. 

Accuracy refers to the extent to which decisions based on test scores match decisions that 
would have been made if the scores did not contain any measurement error. Accuracy must be 
estimated, because errorless test scores do not exist. Consistency measures the extent to which 
classification decisions based on test scores match the decisions based on scores from a second, 
parallel form of the same test. Consistency can be evaluated directly from actual responses to test 
items if two complete and parallel forms of the test are given to the same group of students. In 
operational test programs, however, such a design is usually impractical. Instead, techniques have 
been developed to estimate both the accuracy and the consistency of classification decisions based on 
a single administration of a test. The Livingston and Lewis (1995) method was used for the NYSAA 
because it is easily adaptable to all types of testing formats. 

The accuracy and consistency estimates reported in the following tables make use of “true 
scores” in the classical test theory sense. A true score is the score that would be obtained if a test had 
no measurement error. Of course, true scores cannot be observed and, therefore, must be estimated. 
In the Livingston and Lewis method, estimated true scores are used to categorize students into their 
“true” classifications. 

For the NYSAA, after various technical adjustments (described in Livingston & Lewis, 1995), a
four-by-four contingency table of accuracy was created for each content area and grade, where cell [i, j]
represented the estimated proportion of students whose true score fell into classification i (where i = 1
to 4) and observed score into classification j (where j = 1 to 4). The sum of the diagonal entries (i.e., the
proportion of students whose true and observed classifications matched) signified overall accuracy.
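
As a minimal sketch of this last step only (not of the Livingston and Lewis estimation itself), overall accuracy can be read off a true-by-observed table as the sum of its diagonal. The proportions below are made-up illustrative values, not NYSAA results, and `overall_agreement` is a hypothetical helper name.

```python
def overall_agreement(table):
    """Return the sum of the diagonal of a square table of joint proportions."""
    size = len(table)
    if any(len(row) != size for row in table):
        raise ValueError("table must be square")
    return sum(table[i][i] for i in range(size))


# Hypothetical joint proportions (assumed for illustration only):
# rows = true classification, columns = observed classification.
accuracy_table = [
    [0.08, 0.02, 0.00, 0.00],
    [0.02, 0.10, 0.03, 0.00],
    [0.00, 0.04, 0.40, 0.05],
    [0.00, 0.00, 0.04, 0.22],
]

overall_accuracy = overall_agreement(accuracy_table)
print(f"Overall accuracy: {overall_accuracy:.2f}")
```

The same helper applies unchanged to a form-by-form consistency table, since overall consistency is defined as the diagonal sum of that table.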

To calculate consistency, true scores were used to estimate the joint distribution of 
classifications on two independent, parallel test forms. Following statistical adjustments per Livingston 
and Lewis (1995), a new four-by-four contingency table was created for each content area and grade, 
and was populated by the proportion of students who would be categorized into each combination of 
classifications according to the two (hypothetical) parallel test forms. Cell [i, j] of this table represented
the estimated proportion of students whose observed score on the first form would fall into classification
i (where i = 1 to 4) and whose observed score on the second form would fall into classification j (where
j = 1 to 4). The sum of the diagonal entries (i.e., the proportion of students categorized by the two forms
into exactly the same classification) signified overall consistency. 

Another way to measure consistency is to use Cohen’s (1960) coefficient κ (kappa), which
assesses the proportion of consistent classifications after removing the proportion of consistent
classifications that would be expected by chance. It is calculated using the following formula:

$$
\kappa = \frac{(\text{Observed agreement}) - (\text{Chance agreement})}{1 - (\text{Chance agreement})}
       = \frac{\sum_i C_{ii} - \sum_i C_{i.}\,C_{.i}}{1 - \sum_i C_{i.}\,C_{.i}}
$$

where

$C_{i.}$ is the proportion of students whose observed performance level would be Level i (where i = 1 to 4) on the
first hypothetical parallel form of the test,

$C_{.i}$ is the proportion of students whose observed performance level would be Level i (where i = 1 to 4) on the
second hypothetical parallel form of the test, and

$C_{ii}$ is the proportion of students whose observed performance level would be Level i (where i = 1 to 4) on
both hypothetical parallel forms of the test.

Because κ is corrected for chance, its values are lower than those of other consistency estimates.
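
The kappa calculation can be sketched as follows, using the row marginals, column marginals, and diagonal entries of the form-by-form consistency table as defined above. The table values are hypothetical (chosen to be symmetric, as parallel forms imply) and are not NYSAA results; `cohen_kappa` is an illustrative helper name.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square table of joint classification proportions."""
    size = len(table)
    # Row marginals: proportion classified at each level on the first form.
    row_totals = [sum(table[i]) for i in range(size)]
    # Column marginals: proportion classified at each level on the second form.
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    observed = sum(table[i][i] for i in range(size))
    chance = sum(row_totals[i] * col_totals[i] for i in range(size))
    return (observed - chance) / (1.0 - chance)


# Hypothetical consistency table (assumed for illustration only):
# rows = level on form 1, columns = level on form 2.
consistency_table = [
    [0.07, 0.03, 0.00, 0.00],
    [0.03, 0.08, 0.04, 0.00],
    [0.00, 0.04, 0.38, 0.06],
    [0.00, 0.00, 0.06, 0.21],
]

kappa = cohen_kappa(consistency_table)
print(f"kappa = {kappa:.2f}")
```

In this made-up table the diagonal sums to 0.74, so kappa comes out below the overall consistency, as the chance correction implies.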
The accuracy and consistency analyses described above are provided in Table 10-2. The table
includes overall accuracy and consistency indices, including kappa. Accuracy and consistency values
conditional upon performance level are also given. For these calculations, the denominator is the
proportion of students associated with a given performance level. For example, the conditional
accuracy value is 0.83 for Meeting Learning Standards for Grade 3 ELA. This figure indicates that
among the students whose true scores placed them in this classification, 83% would be expected to be
in this classification when categorized according to their observed scores. Similarly, a consistency
value of 0.79 indicates that 79% of students with observed scores in the Meeting Learning Standards
level would be expected to score in this classification again if a second, parallel test form were used.
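
A short sketch of the conditional calculation, assuming a true-by-observed table of joint proportions: each diagonal entry is divided by its row total (the proportion of students whose true score falls in that level). The table values are hypothetical, not NYSAA results, and `conditional_on_level` is an illustrative name.

```python
def conditional_on_level(table):
    """Diagonal entry divided by the row total, for each level."""
    values = []
    for level, row in enumerate(table):
        row_total = sum(row)
        # A level with no students has no defined conditional value.
        values.append(row[level] / row_total if row_total > 0 else float("nan"))
    return values


# Hypothetical joint proportions (assumed for illustration only):
# rows = true classification, columns = observed classification.
accuracy_table = [
    [0.08, 0.02, 0.00, 0.00],
    [0.02, 0.10, 0.03, 0.00],
    [0.00, 0.04, 0.40, 0.05],
    [0.00, 0.00, 0.04, 0.22],
]

conditional_accuracy = conditional_on_level(accuracy_table)
for level, value in enumerate(conditional_accuracy, start=1):
    print(f"Level {level}: {value:.2f}")
```

For conditional consistency the same division would be applied to the form-by-form table, with the row total then being the proportion of students observed at that level on one form.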


Table 10-2. 2014-15 NYSAA: Summary of Decision Accuracy (and Consistency) Results
by Content Area and Grade: Overall and Conditional on Performance Level

                                                         Conditional on Level
Content Area            Grade        Overall      Kappa  Not          Partially    Meeting      Meeting with
                                                         Meeting      Meeting                   Distinction
English Language Arts   3            0.80 (0.73)  0.55   0.72 (0.55)  0.68 (0.58)  0.83 (0.79)  0.86 (0.72)
                        4            0.82 (0.75)  0.51   0.64 (0.37)  0.69 (0.57)  0.85 (0.82)  0.83 (0.66)
                        5            0.81 (0.74)  0.49   0.65 (0.37)  0.66 (0.51)  0.84 (0.81)  0.83 (0.67)
                        6            0.82 (0.75)  0.54   0.69 (0.47)  0.69 (0.57)  0.85 (0.82)  0.85 (0.70)
                        7            0.77 (0.69)  0.49   0.72 (0.57)  0.49 (0.38)  0.81 (0.78)  0.85 (0.70)
                        8            0.78 (0.71)  0.55   0.78 (0.68)  0.47 (0.36)  0.82 (0.77)  0.87 (0.77)
                        High School  0.82 (0.75)  0.61   0.77 (0.65)  0.71 (0.61)  0.84 (0.80)  0.88 (0.78)
Mathematics             3            0.80 (0.73)  0.54   0.75 (0.62)  0.61 (0.50)  0.84 (0.81)  0.86 (0.72)
                        4            0.78 (0.71)  0.53   0.74 (0.61)  0.57 (0.46)  0.82 (0.77)  0.87 (0.74)
                        5            0.81 (0.74)  0.53   0.70 (0.52)  0.63 (0.51)  0.84 (0.81)  0.85 (0.70)
                        6            0.82 (0.75)  0.60   0.74 (0.59)  0.72 (0.63)  0.83 (0.79)  0.88 (0.78)
                        7            0.78 (0.70)  0.49   0.68 (0.49)  0.63 (0.51)  0.81 (0.78)  0.84 (0.68)
                        8            0.79 (0.71)  0.55   0.76 (0.65)  0.62 (0.51)  0.83 (0.78)  0.86 (0.73)
                        High School  0.79 (0.72)  0.53   0.71 (0.56)  0.61 (0.49)  0.82 (0.79)  0.87 (0.73)
Science                 4            0.76 (0.68)  0.36   0.59 (0.33)  0.51 (0.38)  0.81 (0.79)  0.81 (0.55)
                        8            0.76 (0.67)  0.34   0.56 (0.26)  0.54 (0.40)  0.79 (0.77)  0.82 (0.54)
                        High School  0.75 (0.67)  0.41   0.59 (0.34)  0.55 (0.42)  0.77 (0.75)  0.85 (0.64)
Social Studies          High School  0.76 (0.67)  0.48   0.63 (0.40)  0.62 (0.50)  0.71 (0.66)  0.88 (0.77)


For some testing situations, the greatest concern may be decisions around level thresholds. For 
example, in testing done for No Child Left Behind (NCLB) Act of 2001 accountability purposes, the 
primary concern is distinguishing between students who are proficient and those who are not yet 
proficient. In this case, the accuracy of the Partially Meeting/Meeting threshold is of greatest interest. 
Table 10-3 provides accuracy and consistency estimates at each cutpoint, as well as false positive and 
false negative decision rates. (A false positive is the proportion of students whose observed scores 
were above the cut and whose true scores were below the cut. A false negative is the proportion of 
students whose observed scores were below the cut and whose true scores were above the cut.) 
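
To illustrate how the cutpoint statistics relate to the four-by-four accuracy table, the sketch below collapses the table at a single cut and sums the cells on each side. The proportions are hypothetical, not NYSAA results; `cutpoint_rates` and the zero-based `cut` index are assumptions made for this example.

```python
def cutpoint_rates(table, cut):
    """Collapse a true-by-observed table at one cut.

    Levels with index < cut count as below the cut; the rest as above it.
    Returns (accuracy, false_positive, false_negative).
    """
    size = len(table)
    if not 0 < cut < size:
        raise ValueError("cut must separate at least one level on each side")
    accuracy = false_positive = false_negative = 0.0
    for true_level in range(size):
        for observed_level in range(size):
            proportion = table[true_level][observed_level]
            true_above = true_level >= cut
            observed_above = observed_level >= cut
            if true_above == observed_above:
                accuracy += proportion
            elif observed_above:
                false_positive += proportion  # observed above, true below
            else:
                false_negative += proportion  # observed below, true above
    return accuracy, false_positive, false_negative


# Hypothetical joint proportions (assumed for illustration only):
# rows = true classification, columns = observed classification.
accuracy_table = [
    [0.08, 0.02, 0.00, 0.00],
    [0.02, 0.10, 0.03, 0.00],
    [0.00, 0.04, 0.40, 0.05],
    [0.00, 0.00, 0.04, 0.22],
]

# cut=2 separates Partially Meeting (index 1) from Meeting (index 2).
accuracy, false_positive, false_negative = cutpoint_rates(accuracy_table, cut=2)
print(f"{accuracy:.2f} {false_positive:.2f} {false_negative:.2f}")
```

By construction the three values sum to the table total, so accuracy at a cut equals one minus the false positive and false negative rates.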

The indices described above are derived from Livingston and Lewis’s (1995) method of 
estimating the accuracy and consistency of classifications. It should be noted that Livingston and Lewis
discuss two versions of the accuracy and consistency tables. A standard version performs calculations
for forms parallel to the form taken. An “adjusted” version adjusts the results of one form to match the
observed score distribution obtained in the data. The tables on the previous pages use the standard 
version for two reasons: (1) This “unadjusted” version can be considered a smoothing of the data, 
thereby decreasing the variability of the results; and (2) for results dealing with the consistency of two 
parallel forms, the unadjusted tables are symmetrical, indicating that the two parallel forms have the 
same statistical properties. This second reason is consistent with the notion of forms that are parallel; 
that is, it is more intuitive and interpretable for two parallel forms to have the same statistical 
distribution. 

Note that, as with other methods of evaluating reliability, DAC statistics calculated based on 
small groups can be expected to be lower than those calculated based on larger groups. For this 
reason, the values presented in the following tables should be interpreted with caution. In addition, it is 
important to remember that it is inappropriate to compare DAC statistics between grades and content
areas.


Table 10-3. 2014-15 NYSAA: Summary of Decision Accuracy (and Consistency) Results
by Content Area and Grade: Conditional on Cutpoint

                                     Not Meeting /                      Partially Meeting /                Meeting /
                                     Partially Meeting                  Meeting                            Meeting with Distinction
Content Area            Grade        Accuracy       False     False     Accuracy       False     False     Accuracy       False     False
                                     (Consistency)  Positive  Negative  (Consistency)  Positive  Negative  (Consistency)  Positive  Negative
English Language Arts   3            0.97 (0.96)    0.01      0.02      0.91 (0.88)    0.04      0.04      0.91 (0.88)    0.06      0.03
                        4            0.99 (0.99)    0.00      0.01      0.92 (0.89)    0.04      0.04      0.91 (0.88)    0.07      0.02
                        5            0.99 (0.98)    0.00      0.01      0.93 (0.90)    0.03      0.04      0.90 (0.86)    0.07      0.03
                        6            0.98 (0.98)    0.00      0.01      0.92 (0.89)    0.04      0.04      0.91 (0.88)    0.06      0.02
                        7            0.95 (0.93)    0.02      0.03      0.91 (0.87)    0.04      0.05      0.90 (0.87)    0.07      0.03
                        8            0.95 (0.92)    0.02      0.03      0.92 (0.88)    0.04      0.04      0.91 (0.88)    0.06      0.03
                        High School  0.97 (0.96)    0.01      0.02      0.93 (0.90)    0.04      0.04      0.92 (0.90)    0.05      0.02
Mathematics             3            0.96 (0.94)    0.02      0.02      0.92 (0.88)    0.04      0.04      0.92 (0.89)    0.05      0.02
                        4            0.96 (0.94)    0.02      0.02      0.91 (0.88)    0.04      0.04      0.91 (0.88)    0.06      0.03
                        5            0.97 (0.96)    0.01      0.02      0.92 (0.89)    0.04      0.04      0.91 (0.88)    0.06      0.02
                        6            0.98 (0.97)    0.01      0.01      0.92 (0.89)    0.04      0.04      0.92 (0.89)    0.05      0.03
                        7            0.97 (0.96)    0.01      0.02      0.91 (0.87)    0.05      0.05      0.90 (0.87)    0.07      0.02
                        8            0.95 (0.93)    0.02      0.03      0.91 (0.87)    0.05      0.04      0.93 (0.90)    0.05      0.02
                        High School  0.97 (0.95)    0.01      0.02      0.92 (0.89)    0.04      0.04      0.91 (0.87)    0.07      0.03
Science                 4            0.97 (0.96)    0.01      0.02      0.90 (0.86)    0.05      0.06      0.89 (0.85)    0.09      0.02
                        8            0.98 (0.97)    0.00      0.02      0.90 (0.86)    0.05      0.06      0.87 (0.83)    0.11      0.02
                        High School  0.98 (0.97)    0.01      0.02      0.91 (0.88)    0.04      0.05      0.86 (0.82)    0.11      0.03
Social Studies          High School  0.98 (0.97)    0.01      0.01      0.91 (0.88)    0.04      0.04      0.86 (0.81)    0.10      0.04




10.4 INTERRATER CONSISTENCY 


Chapter 8 of this report describes in detail the processes that were implemented to monitor the 
quality of the hand-scoring of student responses for polytomous items. One of these processes was 
double-blind scoring of all student responses. Results of the double-blind scoring were used during 
scoring to identify Scorers who required retraining or other intervention, and are presented here as 
evidence of the reliability of the NYSAA. A summary of the interrater consistency results is presented in 
Table 10-4. Results in the table are collapsed across the tasks by content area and grade. The table 
shows the number of included scores, the percent exact agreement, the correlation between the first 
two sets of scores, the mean absolute difference between scores that did not have exact agreement, 
and the standard deviation of these absolute differences. This same information is provided at the item
level in Appendix E.
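
The summary statistics named above can be sketched as follows for two sets of scores on the same responses. The score lists are invented for illustration, `interrater_summary` is a hypothetical helper, and the use of a Pearson correlation and of the sample standard deviation (`statistics.stdev`) are assumptions, since the report does not say which variants were used.

```python
from math import sqrt
from statistics import mean, stdev


def interrater_summary(first_scores, second_scores):
    """Agreement statistics for two sets of scores on the same responses."""
    if not first_scores or len(first_scores) != len(second_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    count = len(first_scores)
    differences = [abs(a - b) for a, b in zip(first_scores, second_scores)]
    non_exact = [d for d in differences if d > 0]

    # Pearson correlation, computed directly from deviations about the means.
    mean_first, mean_second = mean(first_scores), mean(second_scores)
    covariance = sum((a - mean_first) * (b - mean_second)
                     for a, b in zip(first_scores, second_scores))
    ss_first = sum((a - mean_first) ** 2 for a in first_scores)
    ss_second = sum((b - mean_second) ** 2 for b in second_scores)
    if ss_first > 0 and ss_second > 0:
        correlation = covariance / sqrt(ss_first * ss_second)
    else:
        correlation = float("nan")

    return {
        "n": count,
        "percent_exact": 100.0 * (count - len(non_exact)) / count,
        "correlation": correlation,
        # Both difference statistics use only the non-exact pairs.
        "mean_abs_diff": mean(non_exact) if non_exact else 0.0,
        "sd_abs_diff": stdev(non_exact) if len(non_exact) > 1 else 0.0,
    }


# Invented scores for ten responses (not NYSAA data).
first = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
second = [10, 20, 30, 40, 50, 60, 70, 80, 88, 104]
summary = interrater_summary(first, second)
print(summary)
```

Restricting the mean and standard deviation to non-exact pairs explains why a table can show very high exact agreement alongside mean absolute differences well above zero.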


Table 10-4. 2014-15 NYSAA: Summary of Interrater Consistency Statistics
Collapsed Across Items by Content Area and Grade

                                             Overall         Overall      Overall Mean         Overall S.D.
                                             Interrater      Interrater   Absolute Difference  of Absolute
Subject                 Grade        N       Percent Exact   Correlation  for Non-Exact        Difference
                                             Agreement                    Agreement
English Language Arts   3            2,560   98.55           1.00         2.88                 2.80
                        4            2,971   98.35           1.00         2.19                 2.58
                        5            3,050   98.39           1.00         2.75                 1.97
                        6            2,993   98.40           1.00         2.48                 2.42
                        7            3,161   98.67           1.00         3.65                 3.07
                        8            3,140   98.85           1.00         3.34                 2.50
                        High School  2,850   98.53           1.00         2.57                 2.65
Mathematics             3            2,546   98.90           1.00         2.35                 2.40
                        4            2,992   98.36           1.00         2.50                 2.46
                        5            3,050   98.07           1.00         3.02                 2.86
                        6            3,000   97.47           1.00         2.49                 2.38
                        7            3,186   97.77           1.00         2.17                 1.80
                        8            3,180   98.40           1.00         2.62                 2.83
                        High School  2,809   97.76           1.00         3.08                 2.83
Science                 4            1,189   98.65           0.99         4.88                 3.80
                        8            1,288   98.29           1.00         2.65                 2.41
                        High School  1,161   97.93           1.00         2.60                 2.12
Social Studies          High School  1,153   98.87           1.00         2.43                 1.73


CHAPTER 11 COMPARABILITY (SCALING AND EQUATING)


11.1 COMPARABILITY OF SCORES ACROSS YEARS 


In administering the New York State Alternate Assessment (NYSAA), teachers select 
Extensions or Alternate Grade-Level Indicators (AGLIs), following the Test Blueprints. Use of the 
Extensions or AGLIs and Blueprints ensures that the assessment, as it is administered, is appropriate 
for the individual needs of the student being assessed and that the standards required are covered. 
The process enables teachers to customize the assessment for individual students while at the same 
time ensuring comparability across years through the use of the same Blueprints, Extensions/AGLIs, 
and Scoring Rubrics from year to year. Additionally, comparability is ensured through the scoring 
process. Teachers use the same Scoring Rubric for scoring datafolios each year, and scoring occurs at 
regional Scoring Institutes that all follow the same Scoring Training program and Scoring Procedures, 
as well as the standard scoring Quality Control Processes, as described in Chapter 8. Additional 
processes to ensure across-year comparability include calculation of reported scores and
categorization into achievement levels, as described below.


11.1.1 Standard Setting 


Standard setting was conducted in June 2014 to establish cut scores for the scaled total scores 
for each alternate performance level in English Language Arts (ELA) and mathematics, Grades 3 
through 8 and high school; in science, Grades 4 and 8 and high school; and in high school social 
studies. To ensure continuity of score reporting across years, the cuts that were established at the 
standard-setting meeting will continue to be used in future years, until it is necessary to reset 
standards. The scaled total score cutpoints for the NYSAA as established via standard setting are
presented in Table 11-1.


Table 11-1. 2014-15 NYSAA: Cut Scores on Reporting Scale
by Content Area and Grade

                                     Scaled Score Cuts      Scaled Score
Content Area            Grade        Cut 1  Cut 2  Cut 3    Minimum  Maximum
English Language Arts   3            426    449    491      400      525
                        4            421    445    489      400      525
                        5            424    444    484      400      525
                        6            425    447    490      400      525
                        7            429    443    486      400      525
                        8            431    442    480      400      525
                        High School  427    449    490      400      525
Mathematics             3            432    450    493      400      525
                        4            428    445    488      400      525
                        5            423    441    485      400      525
                        6            422    445    485      400      525
                        7            425    447    492      400      525
                        8            426    443    483      400      525
                        High School  425    446    496      400      525
Science                 4            559    567    589      550      600
                        8            559    569    591      550      600
                        High School  559    569    591      550      600
Social Studies          High School  559    570    586      550      600


Table F-1 in Appendix F shows performance level distributions for 2014—15 by content area and grade. 
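
As a hedged illustration of how cut scores partition the reporting scale, the sketch below maps a scaled score to a performance level using the Grade 3 ELA cuts from Table 11-1. It assumes that a score equal to a cut falls in the higher level, which the report does not state explicitly; `performance_level` is a hypothetical helper, not part of the NYSAA scoring system.

```python
from bisect import bisect_right

LEVELS = (
    "Not Meeting Learning Standards",
    "Partially Meeting Learning Standards",
    "Meeting Learning Standards",
    "Meeting Learning Standards with Distinction",
)


def performance_level(scaled_score, cuts, minimum, maximum):
    """Map a scaled score to a level name, given three ascending cut scores."""
    if not minimum <= scaled_score <= maximum:
        raise ValueError("scaled score is outside the reporting scale")
    # Assumption: a score at or above a cut belongs to the higher level.
    return LEVELS[bisect_right(cuts, scaled_score)]


# Grade 3 ELA cuts and scale range, taken from Table 11-1.
GRADE3_ELA_CUTS = (426, 449, 491)
level = performance_level(449, GRADE3_ELA_CUTS, minimum=400, maximum=525)
print(level)
```

Because the cuts are held fixed across years, the same lookup applied to a later administration would place equal scaled scores in equal levels.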


11.1.2 Reported Scores (Cumulative Distributions) 


Students’ entry scores are calculated from their Level of Accuracy and Level of Complexity
scores on the final date of student performance for each of the Extensions or AGLIs in a given entry.
The overall score is then the sum of the entry scores. Using this formula, there may be multiple ways 
that a student can attain a given total score. 

Graphs of the cumulative reported raw score distributions for 2014-15 are provided in Appendix
G. Curves that lie farther to the right represent higher performance.


11.1.3 Performance Level Distributions 


Appendix F shows the percentages of students earning scores at each performance level. A
score of No Score (NS) is designated if a datafolio does not adhere to the administration guidelines.
(Complete information regarding scoring can be found in the two scoring documents titled Steps for
Scoring 2014-15 NYSAA Datafolios and Decision Rules for Scoring 2014-15 NYSAA Datafolios.) The
percentages are presented by grade, content area, and performance level.


11.2 LINKAGES ACROSS GRADES


In developing the NYSAA, a content-based approach for addressing continuity across grades
was implemented. Specifically, issues of continuity were addressed in the following processes: (1)
development, (2) administration, and (3) standard setting.

As explained in Chapter 4, the Extensions and AGLIs describe the content to be included in
students’ instructional programs for each grade level. The Extensions in ELA and mathematics are
based on the Common Core Learning Standards (CCLS), and AGLIs in science and social studies are
based on the core curriculum’s Grade-Level Expectations, but have been reduced in depth and 
breadth. The Extensions and AGLIs are designed to follow a developmental continuum of skills that 
increases across grades. Each Assessment Task must align to the Extension or AGLI, and each is 
designed to measure grade-specific content and skills. These Assessment Tasks and the Extensions or 
AGLIs, along with the Test Blueprints, were designed to mirror the developmental continuum reflected 
in the Extensions and AGLIs and to ensure that each datafolio builds on the appropriate knowledge and 
skills, thereby reflecting the desired continuity across grades. 

During administration, the Test Blueprint serves teachers as a guide to selecting Extensions and 
AGLIs that are appropriate for a given student. In addition, teachers must select Assessment Tasks that 
are aligned with the chosen Extensions and AGLIs. As with other aspects of the development and 
administration of the NYSAA, use of the Test Blueprints and the Extensions and AGLIs ensures that the 
student is being assessed at a level that is appropriate for his or her individual needs and that the 
Extensions or AGLIs and Assessment Tasks to which students are exposed follow a developmentally 
appropriate continuum from year to year. Thus, linkages across grades are built into the design of the 
datafolio. 

Finally, the continuity of the NYSAA across grades was further verified through the standard- 
setting procedures. The achievement level descriptors used for standard setting were based on the 
student expectations as delineated in the Extensions and AGLIs. Proficiency across grades, therefore, 
was expected to follow the developmental continuum established by the Extensions or AGLIs, and thus
to reflect a higher level of cognition as the grades increased.




CHAPTER 12 VALIDITY 


12.1 PROCEDURAL VALIDITY 


To ensure the consistency of the information given to teachers across New York State, sets of
documents and training programs were developed and distributed statewide. New York State has a
group of Alternate Assessment Training Network Specialists (AATN Specialists) and Score Site
Coordinators (SSCs) who present a turnkey training provided to them by the New York State Education
Department (the Department) and Measured Progress.

For the administration of the New York State Alternate Assessment (NYSAA), the materials
included the following:


NYSAA Administration Manual: This document contains all of the guidelines and specific 
requirements of the NYSAA; all of the forms required to be used in the datafolio; and the 
Test Blueprints, Extensions for English Language Arts (ELA) and mathematics, Alternate 
Grade-Level Indicators (AGLIs) for science and social studies, and Assessment Tasks 
for each Extension or AGLI for each grade level and content area. 


Training program DVD: The entire Administration Training program that is used with 
teachers is contained in this recorded program. All AATN Specialists are required to use 
the DVD in its entirety to train teachers. It ensures that the exact same message is 
imparted statewide. 


Training program slides and handouts: All slides and handouts developed by the 
Department and Measured Progress are required to be used by the AATN Specialists 
while training teachers. The handouts include slide printouts and Guided Practice 
activities. 


For the scoring of the NYSAA, the materials included the following: 


Steps for Scoring 2014-15 NYSAA Datafolios and Decision Rules for Scoring 2014—15 
NYSAA Datafolios: These are the two main documents used to guide the process for 
scoring each datafolio (see Appendices B and C). 


Training program DVD: The entire Scoring Training program that is used with Scorers is 
contained in this recorded program. All SSCs and AATN Specialists are required to use 
the DVD in its entirety to train Scorers. It ensures that the exact same message is 
imparted statewide. 


Datafolio practices and qualifiers: All Scorers must complete the three practice samples
provided and then must qualify by scoring datafolio samples. All Scorers are qualified
using calibrated materials that were initially identified during a Benchmarking process.


12.2 CONTENT VALIDITY


The Standards for Educational and Psychological Testing (AERA et al., 2014) notes that an 
important part of establishing test validity is ensuring that a close, substantive relationship exists 
between a test’s content and the underlying construct that it is intended to measure. The Standards 
further elaborate that the test content refers to the “themes, wording, and format of the items, tasks, or 
questions on a test. Administration and scoring may also be relevant to content-based evidence” (2014, 
p. 14). In addition to describing the content in detail, content validity evidence must, of course, relate 
the content to the construct that the test is intended to measure. One important approach in this regard, 
mentioned in the Standards, is the use of “expert judgment of the relationship between parts of the test 
and the construct” (2014, p. 14). 

The New York State (NYS) Learning Standards provide the framework for the New York State 
Testing Program, including the NYSAA. For ELA and mathematics, the standards are from the NYS 
Common Core Learning Standards (CCLS). For science and social studies, the standards are from the 
NYS Learning Standards. These standards are the constructs that are intended to be measured by the 
NYSAA. Chapters 4 through 6 of this report describe in detail the development and design of the 
content for the NYSAA, with special emphasis on the relationship of the test content to the standards. 
Chapter 8 provides a detailed description of the scoring process for the NYSAA, again emphasizing 
that the procedures used ensure strong adherence to the standards. Another important component of 
the Scoring Procedure is the standard-setting process, in which expert judgment is used to set the 
scores on the test that correspond to different levels of classification of student achievement relative to 
the standards. The Standard Setting Report documenting the June 2014 standard-setting meeting 
describes the rigorous procedures that were followed to ensure that the content-related aspects of the 
standard setting maintained a strong, substantive alignment with the standards. 

As shown by the above discussion of content validity and by the descriptions of the contents
of Chapters 4, 5, 6, and 8 of this report, a complete description of the content validity of the NYSAA is
available to the reader.


12.3 CONSEQUENTIAL VALIDITY 


In 1997, the Department began discussions on how to provide students who have
severe cognitive disabilities access to the general education standards. To that end, an advisory
committee made up of New York State Special Education Teacher and general education teacher
committees was formed. Its goal was to develop a handbook that would provide teachers with an
alternate pathway for this group of students to gain access to the NYS Learning Standards. On July 17,
1997, the New York State Board of Regents endorsed a set of Alternate Performance Indicators (APIs)
that were linked to the NYS Learning Standards. The purpose of the APIs was to provide teachers with
a way of teaching academic content to students with severe cognitive disabilities. The final manual, 
titled The Learning Standards and Alternate Performance Indicators for Students with Severe 
Disabilities, was published in 1998 and distributed statewide. 

As mandated in the reauthorized Individuals with Disabilities Education Act of 1997 (IDEA of 
1997), states were required to have an alternate assessment in place by July 2000 for those students 
who could not participate in the general education assessments, even with accommodations. Because 
of the groundbreaking work already done, the Department, in collaboration with Measured Progress 
and under the guidance of the advisory committee, endorsed the use of the APIs in 1997 as a way to 
measure the knowledge, skills, and understanding of students with severe cognitive disabilities against 
the NYS Learning Standards. The advisory committee concluded that all students must be given the
opportunity to achieve the NYS Learning Standards, but that not all standards are appropriate for this
group of students, which was in line with the intent of the IDEA of 1997. It was understood that this 
group of students would be assessed against APIs because of their inability to participate in the general 
assessments, even with accommodations. The APIs, while based on the NYS Learning Standards, 
were, by their very nature, functional and limited to students with severe cognitive disabilities. They 
reflected what was determined to be appropriate for this group of students. They were not grade 
specific, nor were they aligned to grade-level content. The Committees on Special Education (CSEs) 
determined which students were appropriate for inclusion in the NYSAA, based on several strict criteria, 
and decided on which APIs the students would be assessed. The first NYSAA was piloted between 
March 1998 and March 2000, with full implementation during the 2000—01 school year. The purpose of 
the NYSAA was to promote the inclusion of students with severe cognitive disabilities in the statewide 
assessment program. It was not for the purposes of Adequate Yearly Progress (AYP) as defined by the 
No Child Left Behind (NCLB) Act of 2001. 

The following is the calendar of events that the Department followed to develop and implement
its first alternate assessment.

Spring 1998              Conduct regional training for teachers on the APIs
March 1998-March 2000    Develop and pilot the alternate assessment system
March-June 2000          Provide information and training on the alternate assessment system
July 2000                Implement a statewide alternate assessment system as required by the IDEA of 1997
June 2001                Collect data and report student scores to the public

The Department and its Special Education Teacher and general education teacher committees
were committed to building an assessment and accountability system that included students with
severe cognitive disabilities. New York State was one of the first states to engage teachers,
administrators, policymakers, and others in these important discussions, and it did pioneering work in 
the early years of alternate assessment. 

With the enactment of the NCLB Act of 2001, students in every state, including students with
severe cognitive disabilities, are being held to high levels of academic achievement. The original
NYSAA tested students in Grades 4 and 8 and high school in the content areas of ELA, mathematics, 
science/health, and social studies. Based on new grade-level testing requirements in NCLB, in 
September 2005 the Department began to implement a revised NYSAA that included Grades 3 through 
8 and high school in the content areas of ELA, mathematics, science, and social studies. The students 
were assessed against the original APIs; however, the format and the number of APIs assessed were 
modified. Table 12-1 outlines the revised NYSAA. 


Table 12-1. 2014-15 NYSAA: Revised NYSAA
Grades 3-8 and High School

                                     Grade Equivalents
Datafolio Component                  Anchor                                Expanded
                                     (4, 8, and High School)               (3, 5, 6, and 7)

Table of contents                    ✓                                     ✓

Student Page                         ✓                                     ✓

One Entry Cover Sheet for each       English Language Arts, mathematics,   English Language Arts,
content area                         social studies, science               mathematics

One Data Summary Sheet for each      4 (one for each content area above)   2 (one for English Language
content area                                                               Arts, one for mathematics)

Verifying evidence per API           1 piece per API in each content       3 pieces for mandatory API in
                                     area                                  English Language Arts and
                                                                           mathematics

Permission to tape and photograph    If applicable                         If applicable

Digital Video and Audio Clip         If applicable                         If applicable
Summary form

During the 2005-06 testing cycle, the Department submitted its accountability documentation for 
peer review to the U.S. Department of Education. The results of that review required the Department to 
revise its alternate assessment to ensure:

• the presence of evidence of alignment between the NYSAA alternate achievement
  standards and the newly adopted Grade-Level Expectations;
• that students are assessed at each required grade;
• the setting of cutpoints and the development of Alternate Performance Level Descriptors
  (APLDs) for each grade level and content area; and
• the technical quality of the assessment, including research-based standard setting, and
  the production and submission of the Standard Setting Report and Technical Report.


The new assessment system had to be in place for the 2006-07 testing cycle, culminating with 
standard setting in June 2007. 

Beginning in July 2006, the Department, in collaboration with Measured Progress, redesigned 
the NYSAA. The focus and purpose of the assessment is to ensure that students with severe cognitive 
disabilities are being provided access to the general education curriculum (e.g., Grade-Level 
Expectations). However, for these students, Grade-Level Expectations need to be expanded in both 
breadth and depth. This resulted in development of the AGLIs, which were contained in the NYSAA 
Administration Manual: Appendix H—NYSAA Frameworks. 

The Department brought together groups of Special Education Teacher and general education 
teacher committees, including general education content specialists and Special Education Teachers, 
to develop the AGLIs. The groups referred to the general education Test Blueprints to determine the 
academic core priorities. From there, each content group reviewed the Grade-Level Expectations for 
each grade level and content area. The groups determined the Essences of the Grade-Level 
Expectations. Lastly, the groups wrote AGLIs that were aligned to the Essences of the Grade-Level 
Expectations. In addition to developing the AGLIs, Special Education Teacher and general education 
teacher committees were also brought together to develop Sample Assessment Tasks (SATs) aligned 
to the AGLIs. The following year, the Stakeholder groups were brought in again to further refine what 
was originally developed. 

The new NYSAA was first administered in late fall 2006. This abbreviated administration period 
culminated with regional Scoring Institutes. Standard setting was conducted in June 2007, resulting in 
cut scores for each grade level and content area, as well as in APLDs. The cut scores were approved 
by the Commissioner of Education and submitted, along with the Standard Setting Report, to the U.S. 
Department of Education. The 2007-08 NYSAA administration was a full administration period, and 
was based on the refined AGLIs and SATs. This administration, too, culminated with the regional 
Scoring Institutes. Standard setting was conducted on the revised AGLIs in June 2008, resulting in new 
cut scores and updated APLDs for each grade level and content area. The Commissioner of Education 
approved the updated cut scores in June 2008. The intent of the AGLIs was not changed following the 
2007-08 administration; therefore, the cut scores established during the June 2008 standard setting 
remained consistent for each grade level and content area. 

The New York State Board of Regents committed to the CCSS in January 2010 and formally 
adopted the CCSS for ELA and mathematics in July 2010 and incorporated New York State-specific 
additions, creating the New York State P-12 CCLS in January 2011. The Board of Regents announced 
that, for students with severe cognitive disabilities, student progress on the CCLS would be measured 
beginning with the 2014—15 administration of the NYSAA in ELA and mathematics. 

To align the ELA and mathematics NYSAA with the CCLS, the Department began work in 
November 2011 to develop an alternate assessment to measure the CCLS for ELA and mathematics 
for students with severe cognitive disabilities. The Department brought together teacher committees 
made up of general education content specialists and Special Education Teachers, to review the CCLS 
for the content identified in the new Test Blueprint, develop Essence statements to narrow the depth 
and breadth of the CCLS, and draft Extensions. The new Test Blueprint for the NYSAA was based on 
the general education Test Blueprints to determine the academic core priorities. The Test Blueprints for 
the NYSAA that measure the Extensions to the CCLS in ELA and mathematics were approved in spring 
2012. The draft Essences and Extensions were reviewed extensively during the summer of 2012, and 
draft documents were posted for public comment in September 2012. In October 2012, the committees 
were reconvened to review the revisions to the Essences and Extensions, and to draft Assessment 
Tasks to measure student performance of the CCLS. Following the meeting, the draft Assessment 
Tasks were reviewed and vetted by content and Special Education Teachers, and then were posted for 
public comment from December 2012 to January 2013. Public comments from the first review and the 
second review were incorporated, as appropriate, into the draft Extensions and draft Assessment 
Tasks. 

The administration procedures for the NYSAA were revised to a new test design that 
emphasized a continuum of student performance and intentionally focused on increasing the validity of 
test scores. The procedures were streamlined, where possible, to reduce teacher clerical errors. The 
2014-15 administration procedures applied to all content areas so that teachers would 
not have to use two completely different assessment procedures. The new NYSAA was first 
implemented in 2014-15. The administration culminated with regional Scoring Institutes. Standard 
setting was conducted in June 2014, resulting in cut scores for each grade level and content area, as 
well as in APLDs. The cut scores were approved by the Commissioner of Education, following standard 
setting. 

The information in this section and throughout the Technical Report provides a framework to 
determine the consequential validity of the NYSAA. In order to demonstrate consequential validity, the 
assessment should: 

• provide multiple measurement occasions, 
• show that student results are improving, and 
• demonstrate that revisions to the NYSAA are considered based on Stakeholder feedback. 

The revised NYSAA demonstrates that students are provided multiple measurement occasions 
as embedded in the baseline and final data collection points. Also, Stakeholder input has been critical 
throughout the development and revision processes. 

APPENDIX A - NYSAA TEST BLUEPRINTS FOR 
EACH CONTENT AREA 

New York State Alternate Assessment (NYSAA) 
Test Blueprint for English Language Arts (ELA) 

Columns: Strand | Sub-Strand | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | High School 

Reading Standards for Literature 
    Key Ideas and Details    X 
    Craft and Structure 
    Integration of Knowledge and Ideas 
    Responding to Literature 

Reading Standards for Informational Text 
    Key Ideas and Details 
    Craft and Structure 
    Integration of Knowledge and Ideas 
    Key Ideas & Integration of Knowledge and Ideas 

Writing Standards 
    Text Types and Purposes 
    Production and Distribution of Writing 
    Research to Build and Present Knowledge 

Speaking and Listening 
    Comprehension and Collaboration 
    Presentation of Knowledge and Ideas 

Language Standards 
    Conventions of Standard English 
    Knowledge of Language 
    Vocabulary Acquisition and Use 


New York State Alternate Assessment (NYSAA) 
Test Blueprint for Mathematics 

Domain                                                  | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | High School 
Operations and Algebraic Thinking (OA)                  |    X    |    X    |    X    |         |         |         | 
Number and Operations in Base Ten (NBT)                 |    X    |    X    |    X    |         |         |         | 
Number and Operations - Fractions (NF)                  |    X    |    X    |    X    |         |         |         | 
Measurement and Data (MD)                               |    X    |    X    |    X    |         |         |         | 
Geometry (G)                                            |    X    |    X    |    X    |         |         |         | 
Ratios and Proportional Relationships (RP)              |         |         |         |         |         |         | 
The Number System (NS)                                  |         |         |         |         |         |         | 
Expressions and Equations (EE)                          |         |         |         |         |         |         | 
Functions (F)                                           |         |         |         |         |         |         | 
Statistics and Probability (SP)                         |         |         |         |         |         |         | 
Quantities (NQ)                                         |         |         |         |         |         |         |      X 
Creating Equations (A-CED)                              |         |         |         |         |         |         |      X 
Interpreting Functions (F-IF)                           |         |         |         |         |         |         |      X 
Expressing Geometric Properties with Equations (G-GPE)  |         |         |         |         |         |         |      X 
Interpreting Categorical and Quantitative Data (S-ID)   |         |         |         |         |         |         |      X 

NYSAA Test Blueprint - Science 
Effective with 2014-15 Administration 

Two Standards are assessed for each Grade as Marked by an X 

Columns: Standards | Key Idea | Grade 4 | Grade 8 | High School 

1 - Analysis, Inquiry, and Design (Scientific Inquiry) 
    2 - Beyond the use of reasoning and consensus, scientific inquiry involves the testing of 
        explanations involving the use of conventional techniques and procedures and usually 
        requiring considerable ingenuity. 
    3 - The observations made while testing proposed explanations, when analyzed using 
        conventional and invented methods, provide new insights into phenomena... 

4 - Living Environment 
    1 - Living things are both similar to and different from each other and from nonliving things. 
    3 - Individual organisms and species change over time. 

4 - Physical Setting/Earth Science 
    2 - Many of the phenomena that we observe on Earth involve interactions among components 
        of air, water, and land. 
    3 - Matter is made up of particles whose properties determine the observable characteristics 
        of matter and its reactivity. 

*Note: See the Core Curricula for Science at http://www.p12.nysed.gov/ciai/cores.html#MST. 

NYSAA Test Blueprint - Social Studies (HS only) 
Effective with 2014-15 Administration 

Two Standards are assessed for each Grade as Marked by an X 

Standards                                        | Units                                    | High School 
1 - US History                                   | 2 - Constitutional Foundations           |      X 
2 - World History: Global History and Geography  | 8 - Global Connections and Interactions  | 

See the Core Curricula for Social Studies at: 
http://www.p12.nysed.gov/ciai/cores.html#SOCIALSTUDIES 

APPENDIX B - NYSAA Scoring Procedures 

Procedures for Scoring NYSAA Datafolios 2014-15 


• Follow the steps outlined below to review each NYSAA datafolio. 
• Review the documentation to determine the answer to the question/statement for each step. 
• If a discrepancy is not addressed in this document, consult your Table Leader. 
• Prior to the Scorer recording the error, a Table Leader MUST review and confirm all issues that may 
  result in a “No” for any of the three Connections questions, a “No Score” for a date(s), and/or an 
  adjustment(s) to the Data Summary Sheet (DSS). 


1. Student Demographics, Scorer ID, Scoring Institute Code 
a) Is the student demographic information consistent? 


Student demographic information must be consistent between the demographic label (from the RIC), 
Student Page (in datafolio), and Scannable Score Document. If discrepant or if scannable is missing, 
consult the Table Leader. Record Scorer comment A at the bottom of the Scorer Worksheet. 


b) Apply label in the upper left corner on each page of the Scorer Worksheet. If a label is not 
available, transcribe the information from the Student Page to the Scorer Worksheet. 


c) Fill in your Scorer ID and the Scoring Institute Code. 


Enter your 3-digit Scorer Identification Number and 6-digit Scoring Institute Code in the upper-right 
corner of the Scorer Worksheet. 


d) Is the student’s DOB within the range indicated on the Student Page for the grade assessed? 
    i. If Measured Progress ProFile™ was used, accept the grade level as correct. 
    ii. Note: If a DOB is outside the range specified for any grade level, consult your Table Leader. 

If YES > Mark the grade assessed in the upper right corner of the Scorer Worksheet. 

If NO > Consult the Table Leader 
    Wrong grade level was assessed. 
    Record: 
    • Extension/AGLI code 00099 
    • “N” for No for all Connections questions for each Extension/AGLI within the content areas 
      that should have been assessed 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 1 


e) Are there any Testing Accommodations listed on page two of the Student Page? 

If YES > Transcribe any Testing Accommodations to the Scannable Score Document. 
If NO > Continue to review and score the assessment. 

If page two of the Student Page is missing or incomplete, continue to review and score the 
assessment. 


f) Was a Collegial Review month indicated on the Student Page? 

If YES > Record: 
    • “Y” for Yes for “Was a collegial review of this datafolio conducted?” on the Scannable Score 
      Document. 
If NO > Record: 
    • “N” for No for “Was a collegial review of this datafolio conducted?” on the Scannable Score 
      Document. 
    • Scorer comment B, at the bottom of the Scorer Worksheet 
    Continue to review and score the assessment. 


g) Set aside the Scannable Score Document until all content areas have been reviewed and 
scored. 


Datafolios are scored in order: ELA, mathematics, science and social studies. Review the 
entire datafolio to determine if anything is out of order. Do not reorganize the datafolio. A datafolio with out-of- 
order documents can be scored. 


2. Review Sequence of Documentation for Content Area 


a) Are the required Data Summary Sheets (DSSs) present, one for each standard assessed? (refer to 
DSS Titles in the upper-right corner [e.g., Looking at ELA and math, do you have 1, 2, 3, 4, and 5 in 
titles; in science and social studies, do you have 1 and 2]) 

If NO > Consult the Table Leader 

    DSS missing. 
    Record: 
    • Extension/AGLI code 00099 
    • “N” for No for each Connection question for the missing Extension/AGLI 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 2 
    Proceed to the next Extension/AGLI that has a DSS or next content area if all DSSs are missing. 

    Two or more DSSs are found for the same Standard and there are fewer than the required 
    number of Extensions/AGLIs assessed. 
    • Review and score the first DSS and assessment documentation for the Extension/AGLI assessed 
    For the missing Extension/AGLI, record: 
    • Extension/AGLI code 00099 
    • “N” for No for each Connection question 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 5 
    Proceed to the next Extension/AGLI or the next content area. 
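
The DSS presence check in Step 2a can be illustrated with a short sketch. The counts (titles 1 through 5 for ELA and mathematics, 1 and 2 for science and social studies) come from the step itself; the function name, the dictionary, and the content area labels are hypothetical and are not part of any NYSAA scoring tool.

```python
# Required DSS counts per content area, as stated in Step 2a:
# ELA and mathematics use titles 1-5; science and social studies use titles 1-2.
# The dictionary and function names are illustrative assumptions.
REQUIRED_DSS = {"ELA": 5, "mathematics": 5, "science": 2, "social studies": 2}


def missing_dss_titles(content_area, titles_found):
    """Return the required DSS title numbers that are absent from the datafolio."""
    required = set(range(1, REQUIRED_DSS[content_area] + 1))
    return sorted(required - set(titles_found))


if __name__ == "__main__":
    print(missing_dss_titles("ELA", [1, 2, 4, 5]))   # one DSS is missing
    print(missing_dss_titles("science", [1, 2]))     # nothing is missing
```

Each title returned by the sketch corresponds to one Extension/AGLI that would be recorded with code 00099 after the Table Leader confirms the issue.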


b) Are the DSS forms in order? 
Confirm that the DSSs are in the correct order using the titles (upper right corner) on each form. 

If NO > Consult the Table Leader 
    Documents are out of order. 
    • Consider documentation that is out of order and score the assessment in the correct order. 
      Do not reorganize the datafolio. 
    Record: 
    • Scorer comment C 
    Proceed to Step 3. 


3. Is demographic information on DSS complete and accurate when compared to the 
Student Page? 

If NO > Consult the Table Leader 
    Demographic information is discrepant or incomplete. 
    • Transcribe information from the Student Page to the DSS in red ink. 
    Record: 
    • Scorer comment D 
    Proceed to Step 4. 


4. Extension/AGLI from Grade Level 
Is the Extension/AGLI indicated on the DSS from the student’s assessed grade? 

If YES > 
    • Measured Progress ProFile™ was used to complete datafolio documentation or 
    • Extension/AGLI is from grade. 
    Record: 
    • Extension/AGLI code (5-digits) 
    • “Y” for Yes for “Extension/AGLI from Grade” 
    Proceed to Step 5. 

If NOT SURE > Consult the Table Leader 
    Extension/AGLI is missing on the DSS, but can be located on VE. 
    If Extension/AGLI code or text is found on the VE, and code/text matches the Frameworks, 
    transcribe the information to the DSS in red ink and continue to review and score the 
    assessment. 
    Record: 
    • Extension/AGLI code (5-digits) 
    • “Y” for Yes for “Extension/AGLI from Grade” 
    • Comment D2 
    Proceed to Step 5. 

If NO > Consult the Table Leader 
    DSS for a standard includes an Extension/AGLI from an inappropriate grade level. 
    Record: 
    • Extension/AGLI code 00099 
    • “N” for No for each Connection question 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 4 
    Proceed to next Extension/AGLI or content area. 


5. Task Connects to Extension/AGLI 
Does the Assessment Task (AT) documented on the DSS clearly connect to the 
Extension/AGLI? (reference Extension and AT codes) 

If Measured Progress ProFile™ is not used and the Assessment Task is handwritten, Scorer must 
verify code and text against the AT in the Frameworks to ensure text is exact and AT is from the same 
Level of Complexity as the Extension/AGLI. 

If YES > 
    • Measured Progress ProFile™ used to complete DSS or 
    • Assessment Task clearly connects to selected Extension/AGLI. 
    Record: 
    • “Y” for Yes for “Task Connects to Extension/AGLI” 
    Proceed to Step 6. 

If NOT SURE > Consult the Table Leader 
    Assessment Task (code/text) is missing on the DSS, but can be located on VE. 
    If the Assessment Task code or text is found on the VE and the code/text matches the 
    Frameworks, transcribe the information to the DSS in red ink and continue to review and 
    score the assessment. 
    Record: 
    • “Y” for Yes for “Task Connects to Extension/AGLI” 
    • Comment D3 
    Proceed to Step 6. 

If NO > Consult the Table Leader 

    Assessment task does not connect to Extension/AGLI. 
    Record: 
    • “N” for No for “Task Connects to Extension/AGLI” and remaining Connection question 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 6a 
    Proceed to next Extension/AGLI or content area. 

    Assessment task is missing and cannot be located on the VE (either evidence itself or VE 
    label). 
    Record: 
    • “N” for No for “Task Connects to Extension/AGLI” and remaining Connection question 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 2 
    Proceed to next Extension/AGLI or content area. 


6. Verifying Evidence (VE) Connects to Task (confirming only if VE is connected, not 
whether VE is valid) 
a) Is evidence for both the baseline and final data point found behind the DSS? (VE for two 
separate dates) 

Note: A single DCS may be considered as two pieces of VE. A calendar/chart can only be submitted as 
one piece of VE. 

If NOT SURE > Consult the Table Leader 
    If VE appears to be missing 
    Review the entire datafolio to determine if missing piece of VE is misplaced. If VE is 
    misplaced, leave it where it is found, review and score the assessment. Do not reorganize 
    the datafolio. 
    Record: 
    • Scorer comment C 
    Proceed to Step 6b. 

If MORE THAN TWO PIECES OF VE > 
    If more than baseline and final VE are included 
    Note: Do not confuse this with a student work product that is multiple pages, or with 
    supporting evidence. 
    • Only evidence for the two dates (baseline and final) documented on the DSS can be considered. 
    • Also, if one or both pieces of evidence for this Extension/AGLI are found to be invalid 
      (Step 8), other evidence cannot be considered in its place. 
    • If date(s) on VE are discrepant with DSS, use VE for earliest date as baseline and latest 
      date for final. 
    Record: 
    • Scorer comment G 
    Proceed to Step 6b using only VE for baseline and final. 

If NO > Consult the Table Leader 
    Only one piece of evidence is found 
    Record: 
    • “N” for No for “VE connects to task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 7a, 7b as appropriate 
    Proceed to next Extension/AGLI or content area. 

b) Does the evidence for each data point connect to the Assessment Task documented on DSS? 

• Data Collection Sheets (DCS) must include step, trial or time segment information 
• Digital video and audio evidence must be accessible and can be reviewed by Scorer 
• To connect, each piece of evidence, on its own, must: 
    ✓ Meet the intent of task by demonstrating the student’s skill on the assessed task 
    ✓ Not include information (e.g., directions, items) that conflicts with the vocabulary of task 
      (e.g., main character = main character, main character ≠ important character; reference 
      content glossary to confirm) 
    ✓ Demonstrate any plural in the task (if parenthesis around “s,” plural requirement is optional) 
    ✓ Demonstrate any AND in the task by demonstrating all elements of the assessed task 

If YES > 
    Evidence for both the baseline and final data point demonstrate the above criteria on their 
    own. 
    Proceed to Step 6c. 

If NOT SURE > Consult the Table Leader 
    If connection is not clear for either the baseline or final data point 
    If Table Leader confirms connection is demonstrated 
    Record: 
    • Scorer comment E1 or E2 

If NO > Consult the Table Leader 

    Evidence for baseline and/or final data point does not connect to Assessment Task. 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 8a, 8b, 8c, or 8d 
    Proceed to next Extension/AGLI or content area. 

If NO > Consult the Table Leader (cont’d) 

    DCS includes a single step or time-segment that does not clearly document the student 
    performance for the assessed task (e.g., plural or AND) and there is no other information or 
    notation to clarify how task was conducted. 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 3a, 3b, or 3c 
    Proceed to next Extension/AGLI or content area. 
    Note: it is acceptable to use a multi-step DCS to demonstrate a single step task if all 
    requirements are met 

    Assessment Task includes a plural without parenthesis around “s.” Upon review, one or both 
    pieces of VE do not satisfy the plural. 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 8d 
    Proceed to next Extension/AGLI or content area. 

    Assessment Task includes an AND statement. Upon review, one or both pieces of VE do not 
    satisfy the AND statement. 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 8d 
    Proceed to next Extension/AGLI or content area. 

    DCS included as VE is missing the step, trial (skill/sub-skill) or time-segment information. 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 16d 
    Proceed to next Extension/AGLI or content area. 

    Digital video and/or audio malfunctioned or the clip is unable to be located on the DVD 
    and/or CD. 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 15b or 15d 
    Proceed to next Extension/AGLI or content area. 


c) Is the BASELINE Level of Accuracy 74% or below and calculated correctly? (below threshold) 

If YES > 
    On review of the VE for baseline, Level of Accuracy score is correct and confirmed to be 
    74% or below. 
    Record: 
    • “Y” for Yes for “VE Connects to Task” 
    Proceed to Step 7. 
    Note: Do not record the Level of Accuracy at this step. That will be done in Step 10. 

If NO > Consult the Table Leader 
    Level of Accuracy score for baseline is 75% or higher (as documented or recalculated). 
    Record: 
    • “N” for No for “VE Connects to Task” 
    • “NS” for No Score for baseline and final dates 
    • Procedural Error comment 8f 
    Proceed to next Extension/AGLI or content area. 

If NOT SURE > Consult the Table Leader 
    Information on the VE is missing, contradicts or does not support what is documented for the 
    Level of Accuracy on DSS. 
    • If Level of Accuracy is missing on DSS, but score can be found and verified on VE, 
      transcribe missing percentage to DSS and continue to score the assessment. Record Scorer 
      comment D5. 
    • If error in calculation is clear 
      For example: 
      Math example: “4 + 2 = 6” is marked incorrect by the teacher but is clearly correct. 
      Adjust the baseline score on the DSS in red ink. If recalculation forces baseline score to 
      75% or higher, proceed to No above. 
    • If scorer disagrees, or correct answer can be debated but cannot be clearly resolved one 
      way or the other, accept the percentage documented. Scorer comment K3 or N1, N2, or N3. 
    Proceed to Step 7. 
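
The baseline threshold in Step 6c can be sketched as a small calculation. This is an illustration only: the function names are hypothetical, and the procedure does not say how a non-whole percentage (for example 74.5%) should be treated, so the sketch assumes that anything under 75 counts as below threshold.

```python
def baseline_accuracy(correct, total):
    """Percentage of correct responses documented on the baseline Verifying Evidence."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * correct / total


def baseline_below_threshold(percent):
    """Step 6c: 74% or below continues to Step 7; 75% or higher is a 'No'.

    Assumption: values strictly under 75 are treated as below threshold,
    because the procedure does not describe rounding.
    """
    return percent < 75


if __name__ == "__main__":
    pct = baseline_accuracy(correct=2, total=4)
    print(pct, baseline_below_threshold(pct))
```

If a clear calculation error is corrected and the recalculated value is 75 or higher, the same check sends the Scorer to the "If NO" outcome described in Step 6c.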


7. Dates of Student Performance on the DSS 


Are dates recorded on the DSS for the baseline and final data points within the 2014-15 NYSAA 
administration period (September 29, 2014—February 27, 2015)? 


If NO > Consult the Table Leader 

    One or both dates of student performance is missing from DSS, but can be determined from VE 
    and dates are within the administration period. 
    • Transcribe date(s) from VE to the DSS in red ink. 
    Record: 
    • Scorer comment J 

    One or more dates of student performance cannot be determined from VE or one or more dates 
    on DSS are outside the administration period. 
    Record: 
    • “NS” for No Score for date(s) in question 
    • Procedural Error comment 11a or 11b 
    Review the remaining date(s); proceed to next Extension/AGLI, or content area. 
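
The date check in Step 7 amounts to a range comparison against the administration period stated in the step (September 29, 2014 through February 27, 2015). The sketch below is illustrative; the constant and function names are hypothetical, and it assumes both end dates are inclusive.

```python
from datetime import date

# Administration period as given in Step 7 (endpoints assumed inclusive).
ADMIN_START = date(2014, 9, 29)
ADMIN_END = date(2015, 2, 27)


def in_administration_period(performance_date):
    """Return True when a date of student performance falls inside the period."""
    return ADMIN_START <= performance_date <= ADMIN_END


if __name__ == "__main__":
    print(in_administration_period(date(2014, 11, 3)))
    print(in_administration_period(date(2015, 3, 2)))
```

A date that fails this check would receive “NS” for that date, subject to Table Leader confirmation.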


8. Determining the validity of Verifying Evidence (VE) and Supporting Evidence (SE) 


a) Are the THREE required elements clearly documented on each piece of VE and match the 
information on the DSS? (Verifying Evidence and Supporting Evidence) 
Required elements may be handwritten or printed on the actual VE, on a VE label that is affixed to the 
VE, or a combination of both. A student may record his or her name and/or the date on work products. 
It is acceptable for only the student’s first name to be documented on the VE. 

If YES > 
    Student name, date of student performance and Level of Accuracy are clearly documented. 
    Proceed to Steps 8b, c, e, and/or f depending on type of evidence. 

If NO > Consult the Table Leader 
    One or more required elements on the VE and/or VE label is discrepant with the DSS. 
    Note: Use chart below 
    • Adjust the required element(s) on the DSS in red ink. 
    Record: 
    • Scorer comment K1, K2, or K3 
    Proceed to Steps 8b, c, e, and/or f. 
    Note: Do not make any marks on VE or VE labels 

    The following ... 
    • Required elements documented by the teacher on the VE ... The DSS and the VE label 
    • Required elements on the VE label ... The DSS 
    • Teacher recorded information ... Student recorded information on VE 

If NO > Consult the Table Leader 
    One or more required elements is missing from VE (VE itself or VE label) or VE label is not 
    affixed to VE. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 12a, 12b, or 12c 
    Review remaining date, or proceed to next Extension/AGLI, or content area. 


b) Student Work Product 

• Original, no photocopies of student responses, correction fluid/tape or white/black out. 
• Students may use assistive technology, computers, and/or interactive white board systems (e.g., 
  SMART board) to complete the student work product. 

If YES > Continue to review the other piece of VE submitted or proceed to Step 9. 

If NO > Consult the Table Leader 
    Work product is not original (i.e., photocopies of student responses, correction fluid/tape, 
    black out; teacher erasures). 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 13 
    Review remaining date, or proceed to next Extension/AGLI, or content area. 


c) Data Collection Sheet 
To be valid the DCS must have 

• a minimum of three dates of documented student performance, 
• one piece of SE for each date transcribed to the DSS as VE, and 
• staff initials recorded for each date on the DCS. 

Supporting evidence may be an Observer Verification Form (OVF) OR another type of VE. 

NOTE: It is acceptable for DCS to be considered as VE for only the baseline, only the final, or both 
the baseline and final data points (and another type of VE submitted for the other data point). In all 
cases, the DCS must include at least 3 dates of data and meet all other requirements. 

If YES > Continue with Step 8d below; review each submitted piece of SE individually. 

If NO > Consult the Table Leader 

    SE is missing for any date transcribed to DSS as VE. 
    Record: 
    • “NS” for No Score for the date(s) transcribed from the DCS to the DSS 
    • Procedural Error comment 16c 
    Review remaining date or proceed to next Extension/AGLI or content area. 

    Fewer than three dates are documented on the DCS. 
    Record: 
    • “NS” for No Score for all dates on the DCS transcribed to the DSS 
    • Procedural Error comment 16a 
    Review remaining date, or proceed to next Extension/AGLI, or content area. 

    Staff initials are missing from DCS for any date with student performance data. 
    Record: 
    • “NS” for No Score for all dates on the DCS transcribed to the DSS 
    • Procedural Error comment 16b 
    Review remaining date or proceed to next Extension/AGLI or content area. 


d) Supporting Evidence 

1. Student Work Product - Review Steps 8a and b to determine if student work product is valid SE. 
2. Photographs - Review Steps 8a and e to determine if photographs are valid SE. 
3. Digital video and/or audio clip - Review Steps 8a and f to determine if digital video and/or audio 
   clip is valid SE. 

If YES > Continue to review the other piece of SE submitted or proceed to Step 9. 

If NO > Consult the Table Leader 
    Student work product, photographs, or digital video or audio clip is invalid per Step criteria. 
    Record: 
    • “NS” for No Score for that date 
    • Appropriate Procedural Error comment indicated in Steps 8a, b, e, or f 
    Review the other piece of SE submitted for remaining date or proceed to next Extension/AGLI 
    or content area. 

4. Observer Verification Form (OVF) 

    ○ Review Step 8a and OVF criteria below to determine if OVF is valid SE. 
    ○ NOTE: Only a DCS requires SE. Ignore an OVF submitted in support of original student work, 
      photographic, digital video, or audio evidence. 

Criteria for an OVF 
An OVF is invalid if: 
• student name, date of student performance, and/or Level of Accuracy are missing; 
• supplementary school personnel signed as the observer (e.g., teacher aide or teacher assistant); 
• the person collecting the data (initials on DCS) also signed the OVF as the observer for that date 
  (confirmed by comparing initials and staff key information); 
• more than one date of student performance is documented on a single OVF; 
• the observer's signature and/or title is not included, cannot be confirmed; 
• the observer's signature date is missing, or is not the same date the task was observed; or 
• date of performance or Level of Accuracy on OVF is discrepant with DCS 

If NO > Consult the Table Leader 

    OVF is invalid per one or more criteria listed in the bullets above. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 12a-c or 17a-e 
    Review the other piece of SE submitted for remaining date or proceed to next Extension/AGLI 
    or content area. 

    Observer's title is missing from OVF, but can be confirmed from another OVF in the datafolio. 
    Score the assessment. 
    Record: 
    • Scorer comment M 
    Continue to review the other piece of SE submitted or proceed to Step 9. 

    Observer's title is missing from OVF and cannot be confirmed from another OVF in the 
    datafolio. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 17b 
    Review the other piece of SE submitted for remaining date or proceed to next Extension/AGLI 
    or content area. 


e) Photographs 
Photographic evidence must include 

• a minimum sequence of three photographs of the student performing the task, 
• a minimum of one caption describing the sequence, and 
• a sequence that occurs on a single date. 

If YES > Continue to review the other piece of VE submitted or proceed to Step 9. 

If NO > Consult the Table Leader 

    Fewer than three photographs are submitted of the student performing the task. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 14d 
    Review remaining date, or proceed to next Extension/AGLI or content area. 

    No caption is found. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 14c 
    Review remaining date or proceed to next Extension/AGLI or content area. 

    No date or multiple dates are found on the evidence. 
    Record: 
    • “NS” for No Score for the date 
    • Procedural Error comment 14a or 14b 
    Review remaining date or proceed to next Extension/AGLI or content area. 
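
The Data Collection Sheet requirements in Step 8c and the photograph requirements in Step 8e are both simple checklists, and can be sketched as follows. The class, field, and function names are hypothetical, and the sketch returns plain descriptions rather than Procedural Error codes; the Table Leader review required by the procedure is not modeled.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set


@dataclass
class DcsEntry:
    """One dated row of student performance on a DCS (illustrative structure)."""
    day: date
    staff_initials: str
    transcribed_to_dss: bool       # this date is used as VE on the DSS
    has_supporting_evidence: bool  # an OVF or another type of VE backs this date


def dcs_problems(entries: List[DcsEntry]) -> List[str]:
    """List which Step 8c requirements a DCS fails."""
    problems = []
    if len({e.day for e in entries}) < 3:
        problems.append("fewer than three dates")
    if any(not e.staff_initials.strip() for e in entries):
        problems.append("staff initials missing")
    if any(e.transcribed_to_dss and not e.has_supporting_evidence for e in entries):
        problems.append("supporting evidence missing")
    return problems


def photo_problems(photo_count: int, caption_count: int, dates_found: Set[date]) -> List[str]:
    """List which Step 8e requirements a photograph sequence fails."""
    problems = []
    if photo_count < 3:
        problems.append("fewer than three photographs")
    if caption_count < 1:
        problems.append("no caption")
    if len(dates_found) != 1:
        problems.append("no date or multiple dates")
    return problems


if __name__ == "__main__":
    rows = [
        DcsEntry(date(2014, 10, 6), "AB", True, True),
        DcsEntry(date(2014, 11, 3), "AB", False, False),
        DcsEntry(date(2015, 1, 12), "", True, False),
    ]
    print(dcs_problems(rows))
    print(photo_problems(3, 1, {date(2014, 10, 6)}))
```

An empty list from either function means the evidence meets the listed requirements; it does not replace the Step 8a check of the three required elements.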


f) Digital Video/Audio Clip 

Video/Audio Clip must: 
• be 90 seconds or less (excluding markers) and 
• contain at least one recorded marker with the Student’s name, date of student performance, 
  and Level of Accuracy. 

If YES > Continue to review the other piece of VE submitted or proceed to Step 9. 

If NO > Consult the Table Leader 

    Clip duration is longer than 90 seconds and it is unreasonable to review entire clip. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 15c 
    Review remaining date, or proceed to next Extension/AGLI, or content area. 

    Student’s name, date of student performance, and Level of Accuracy, are not recorded on the 
    clip in any manner. 
    Note: VE label on DVD/CD case or box is not acceptable. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 15a 
    Review remaining date or proceed to next Extension/AGLI or content area. 
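
The two clip requirements in Step 8f can be expressed as a short check. The names below are hypothetical, and the sketch is stricter than the procedure in one respect: it treats any clip over 90 seconds as failing, whereas the procedure assigns “NS” only when it is unreasonable to review the entire clip, which is a judgment the sketch cannot make.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Marker:
    """A recorded marker on a video/audio clip (illustrative structure)."""
    student_name: Optional[str] = None
    performance_date: Optional[str] = None
    level_of_accuracy: Optional[str] = None

    def is_complete(self) -> bool:
        # All three required elements must be present on the marker.
        return all([self.student_name, self.performance_date, self.level_of_accuracy])


def clip_meets_requirements(duration_seconds: float, markers: List[Marker]) -> bool:
    """Step 8f: 90 seconds or less (excluding markers) and at least one complete marker."""
    return duration_seconds <= 90 and any(m.is_complete() for m in markers)


if __name__ == "__main__":
    marker = Marker("Sam", "2014-11-03", "50%")
    print(clip_meets_requirements(85, [marker]))
    print(clip_meets_requirements(85, [Marker(student_name="Sam")]))
```

The `duration_seconds` argument is assumed to already exclude the time taken by the markers, as the step specifies.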


9. Were any supports provided that guided the student to the correct answer? 
Scorers must review each piece of VE to consider whether any documentation guided the student to the 
correct answer. Refer to the chart below for details. 


Actions That Result in an Administrative Error 
Templates or other formats are provided that give or lead the student to the answer. For 
example: 

e the verifying evidence is a sequencing worksheet that contains three boxes that 
state “First,” “Next,” “Last”; the student response choices are pictures that 
contain the words “First,” “Next,” “Last.” 

e the verifying evidence is a number line on which the student must provide 
missing numbers, but the correct number is provided as a shaded or dotted 
number in the spot and the student has to put a sticker of the number on the 
spot. 


If YES > Consult the Table Leader 
    If VE for baseline and/or final data point includes documentation that led the student to 
    the answer. 
    Record: 
    • “NS” for No Score for that date 
    • Procedural Error comment 18a or 18b 
    Review remaining date or proceed to next Extension/AGLI or content area. 

If NO > Continue to review and score the assessment. 


10. Is the Level of Accuracy documented on DSS for the final data point calculated 
correctly based on VE? (Baseline Level of Accuracy was checked at Step 6c) 


If YES >
Record:
• Percentage for the Level of Accuracy for both baseline and final dates
• Yes or No for “was student prompted?” as documented on DSS for both baseline and final
date. Note: do not verify prompts. Always record Yes/No for prompt, even if N/NS.


If NOT SURE > Consult the Table Leader

Information on the VE contradicts or does not support what is documented for the Level of
Accuracy and the Scorer cannot clearly see how to correct calculation.
• If scorer disagrees, or correct answer can be debated but cannot be clearly resolved one
way or the other, accept the percentage documented.
Accept the percentages the teacher documented.
• If Level of Accuracy recorded as a fraction (e.g., 1/4 instead of 25%) accept and score.
Record:
• Percentage for final documented by the teacher
• Percentage for baseline
• Yes/No for the question “was the student prompted?” as documented on DSS for both
baseline and final
• Scorer comment K3 or N1, N2, or N3
Proceed to Step 11


If NO > Consult the Table Leader

Level of Accuracy is missing from the DSS, but is present for a date that has valid VE.
• Transcribe percentage calculation from the VE to the DSS in red ink.
Record:
• Percentage from the VE for final
• Percentage for baseline
• Yes/No for the question “was the student prompted?” as documented on DSS
• Scorer comment D5

Level of Accuracy on the VE is discrepant with what is documented on the DSS.
• Adjust the percentage calculation on the DSS in red ink to match the VE.
Record:
• Adjusted percentage for final
• Percentage for baseline
• Yes/No for the question “was the student prompted?” as documented on DSS
• Scorer comment K3
Note: Never make changes to VE or VE labels.

Level of Accuracy was incorrectly calculated and the Scorer can clearly see how the
percentage calculated can be adjusted.
Note: if Scorer cannot clearly see how to correct calculation, follow “If NOT SURE”
directions.
• If error in calculation is clear
For example:
Math example: “4 + 2 = 6” is marked incorrect by the teacher, but is clearly correct.
• Adjust the percentage calculation on DSS in red ink.
Record:
• Adjusted percentage for final
• Percentage for baseline
• Yes/No for the question “was the student prompted?” as documented on DSS
• Scorer comment O
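
The fraction case in Step 10 (for example, 1/4 accepted in place of 25%) comes down to simple arithmetic. The following is a minimal Python sketch, assuming a Level of Accuracy is written either as "correct/total" or as a percentage string. The function name `level_of_accuracy_percent` is hypothetical and is not part of the NYSAA materials, and no rounding rule is applied because the procedures shown here do not specify one.

```python
from fractions import Fraction


def level_of_accuracy_percent(recorded: str) -> float:
    """Convert a recorded Level of Accuracy to a percentage.

    Accepts a fraction such as "1/4" or a percentage such as "25%".
    Hypothetical helper for illustration only.
    """
    text = recorded.strip()
    if text.endswith("%"):
        return float(text[:-1])
    if "/" in text:
        numerator, denominator = text.split("/")
        return float(Fraction(int(numerator), int(denominator)) * 100)
    raise ValueError(f"Unrecognized Level of Accuracy: {recorded!r}")


print(level_of_accuracy_percent("1/4"))  # 25.0
print(level_of_accuracy_percent("25%"))  # 25.0
```

Both forms give the same percentage, which is why a fraction on the DSS can be accepted and scored without adjustment.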


11. Score the next Extension/AGLI 
Follow Steps 3-10 for the next Extension/AGLI from the same content area. 


12. Score Mathematics, Science, and Social Studies 
Follow Steps 2-11 and score the remaining content areas in order for the grade assessed:
mathematics (Grades 3-8 & HS), science (Grades 4, 8, & HS), and social studies (HS only). 


13. Confirm your Scorer Worksheet is complete and accurate for each Extension/AGLI within 
each content area, including Procedural Error Comments, if applicable, and Scorer 
Comments 

• Extension/AGLI Code, Three Connections Questions: Double check that a five-digit
Extension/AGLI code has been recorded or 00099 (if applicable); that the three Connections
questions are bubbled in as “Y” or “N”; and percentages for Levels of Accuracy are recorded for both
baseline and final data points.

• Confirm Extension/AGLIs have been recorded correctly: each code is documented in
the correct space.

• Confirm Baseline Level of Accuracy is 74% or below or is an “NS” for No Score, if
applicable.

• Procedural Error Comments (1-20): Double check that a Procedural Error Comment has been
recorded on the Scorer Worksheet for each No or No Score.

• Scorer Comments (A-O)/Positive Feedback Comments (P-W): Select comments from the back
of the Scorer Worksheet that will clarify if something was adjusted in the datafolio and/or if something
was questioned during scoring. Scorers are encouraged to also provide positive feedback to
teachers.

• No blank spaces unless the content was not assessed.
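
The Step 13 checks above can be sketched as a small validation routine. This is only an illustration under stated assumptions: the function `check_worksheet_entry` and its argument layout are hypothetical, and the rules encoded are limited to those listed in Step 13 (five-digit code, three Y/N Connections answers, a percentage or "NS" for each data point, and a baseline of 74% or below).

```python
import re


def check_worksheet_entry(code, connections, baseline, final):
    """Return a list of problems found in one Extension/AGLI entry.

    code        -- string, e.g. "31412" or "00099"
    connections -- list of three answers, each "Y" or "N"
    baseline    -- number (percentage) or the string "NS"
    final       -- number (percentage) or the string "NS"
    Hypothetical helper; field names are assumptions for illustration.
    """
    problems = []
    if not re.fullmatch(r"\d{5}", code):
        problems.append("code must be five digits (or 00099)")
    if len(connections) != 3 or any(c not in ("Y", "N") for c in connections):
        problems.append("three Connections questions must each be Y or N")
    for label, value in (("baseline", baseline), ("final", final)):
        is_percent = isinstance(value, (int, float)) and 0 <= value <= 100
        if value != "NS" and not is_percent:
            problems.append(f"{label} must be a percentage or NS")
    if isinstance(baseline, (int, float)) and baseline > 74:
        problems.append("baseline Level of Accuracy must be 74% or below or NS")
    return problems


print(check_worksheet_entry("31412", ["Y", "Y", "Y"], 50, 80))  # []
print(check_worksheet_entry("3141", ["Y", "N"], 80, "NS"))
```

An empty list means the entry passes these checks; it does not replace the Procedural Error and Scorer Comment review, which depends on the scoring decisions themselves.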


14. Complete the Scannable Score Document 


Transcribe the following data:

From the Scorer Worksheet:
• Extension/AGLI code: 5 digits or 00099 (if applicable)
• Three Connections questions: “Y” for Yes or “N” for No
  o Extension/AGLI from grade level
  o Task connects to Extension/AGLI
  o VE connects to task
• Percentages or “NS”, if applicable: Level of Accuracy for baseline and final
• Yes/No for “was student prompted?”, even if N or NS for Extension/AGLI

From the “Not Tested” form, if applicable:
• Absent
• Administrative Error
• Not Enrolled
• Took Another Assessment
• Medically Excused

Confirm you have completed:
• Was a Collegial Review of this datafolio conducted? “Y” for Yes or “N” for No

From the Student Page:
• Transcribe the Testing Accommodations documented on page 2 of the Student Page to the
Scannable Score Document in the space provided.

• Complete the Scannable Score Document for each applicable content area and for any other
information as directed by the SSC.

CAUTION: Errors in transcribing Connection to Grade Level Content and Performance percentages from
the Scorer Worksheet to the Scannable Score Document will directly impact the student from
receiving a reportable score.
PLEASE DOUBLE CHECK ALL TRANSCRIPTIONS TO THE SCANNABLE SCORE DOCUMENT!


CONSULT TABLE LEADER


This table outlines other issues that may come up when scoring a datafolio. These may result in a No 
Score and/or adjustment to the datafolio. If any of these issues are found, consult the Table Leader 
for direction. 


The following may or may not result in a No or No Score. 


Incorrect or teacher-created NYSAA forms were used (e.g., Data Summary Sheet (DSS) 
for the wrong grade, Student Page, or Data Collection Sheet not from NYSAA Administration 
Manual). 


Student Page is missing 


Photocopies (either in part or whole) or correction fluid/tape or black out is found on 
assessment documents. 


“By” statement from Assessment Task not demonstrated on VE. 
Task assessed on the baseline and final data point is different. 


Evidence is found that a mistake in data collection was erased on the DSS, VE, or 
supporting evidence and was not crossed out and initialed by the teacher. 


VE for ELA is submitted in a language other than English. 


Evidence is found that a mistake in documentation by the teacher was erased on the DSS, 
VE, or supporting evidence and was not crossed out, corrected, and initialed. 


VE or supporting evidence clearly appears to be homework. 
A multi-step DCS includes only a single step or a single time segment is documented on DCS. 


Assessment Task (AT) code on VE/VE label does not match Assessment Task code on DCS. 


The following may occur in a datafolio and are acceptable, providing they meet requirements. 


Presentation or number of items is different between the baseline and final data point. 


Chart or calendar is submitted for a date other than the last date recorded on the chart or 
calendar. 


Verifying evidence includes items, questions, steps not relevant to the assessed task. 


Extra VE or supporting evidence was submitted beyond the requirements for a specific 
Extension/AGLI. 


Extra DSSs are found. 


Dates or information printed in the header and/or footer of documents completed with 
Measured Progress ProFile™ contradicts information recorded on the evidence or VE label. 


The year portion of the date in the documentation is discrepant or missing (DSS, VE, 
supporting evidence). 


APPENDIX C—2014-15 SCORING 
DECISION RULES 


2014-15 NYSAA Technical Report: Appendix C—2013-14 Scoring Decision Rules 


Decision Rules for Scoring NYSAA Datafolios 


(For Table Leaders) 


Each numbered item below gives the Scoring Concern/Question, the Decision Rule/Rationale, and, where legible, the step(s) in which it may arise.


1. Scoring Concern/Question: Incorrect or teacher-created NYSAA forms were used (e.g.,
   Data Summary Sheet (DSS) for the wrong grade, Student Page from 2013-14, or Data Collection
   Sheet (DCS) not from NYSAA Administration Manual).
   Decision Rule/Rationale:
   Incorrect Forms
   • If an incorrect Student Page or DSS is used but all assessment requirements can be
     confirmed, score the assessment following the Scoring Procedures.
   • If an incorrect DSS is used and assessment requirements cannot be confirmed, record
     Extension/Alternate Grade Level Indicator (AGLI) code(s) 00099, “N” for No for each
     Connection question and “NS” for No Score for baseline and final of the
     Extension(s)/AGLI(s). Record Procedural Error comment 19. Continue to next
     Extension/AGLI or content area.
   Teacher-created Forms
   • Teacher created his/her own 2014-15 forms, such as a DCS or Verifying Evidence (VE)
     label. If all requirements are clearly documented, score the assessment following the
     Scoring Procedures.
   May Arise in Step(s): 1-8

2. Scoring Concern/Question: Student Page is missing.
   Decision Rule/Rationale: If the student demographic information (student name, date of
   birth) on the DSS can be used to confirm that the correct student was assessed, continue
   to review and score the datafolio. Direct the Scorer to record comment A.

3. Scoring Concern/Question: Photocopies (either in part or whole) or correction fluid/tape
   or black out is found on assessment documents.
   Decision Rule/Rationale:
   • Correction fluid/tape or black out found on page numbers, Student Page, or table of
     contents does not directly impact scores. Score the assessment following the Scoring
     Procedures.
   • Photocopies of the DSS, VE, or supporting evidence (either in part or in whole) or
     correction fluid/tape or black out found on information will directly impact student
     scores.
     o If DSS, record “NS” for No Score for baseline and final dates.
     o If VE, record “NS” for No Score for that date.
     o Record Procedural Error Comment 13.
   Note: Digital photo prints in color or in black and white, computer/tablet device
   printouts, and interactive white board (e.g., SMART board) printouts are acceptable,
   since they are not photocopies.
   May Arise in Step(s): 1-8

Assessment Task (AT)

4. Scoring Concern/Question: Assessment Task includes a “by” statement that is not
   demonstrated in the VE.
   Decision Rule/Rationale: If any part of the Assessment Task, including a “by” statement,
   is not demonstrated in the Verifying Evidence and there is no teacher notation to clarify,
   record “N” for No to “VE connects to Task” and “NS” for No Score for baseline and final of
   the Extension/AGLI. Record Procedural Error comment 8e. Continue to the next Extension or
   AGLI (e.g., grade 4 science AT42211B, “the student will distinguish between a plant and an
   animal by sorting a group of pictures into categories”).
   The exception to this rule is when the teacher has indicated a method of response and the
   student demonstrates a different method of response. This type of “by” statement is not
   related to the Assessment Task and a different method of response would be acceptable.
   May Arise in Step(s): 6

5. Scoring Concern/Question: The task assessed on the baseline and final data points is
   different.
   Decision Rule/Rationale: The same task must be assessed on the baseline and final. If the
   task assessed on the baseline administration is different from the final administration,
   record “N” for No to “VE connects to Task” and “NS” for No Score for baseline and final
   dates. Record Procedural Error comment 10.
   May Arise in Step(s): 5-6

Verifying Evidence (VE)

6. Scoring Concern/Question: Presentation or number of items differs between the baseline
   and final VE.
   Decision Rule/Rationale: Best practice for administration was to provide similar format
   and presentation across the baseline and final administration. For scoring 2014-15, as
   long as the connection of “VE to task” is clearly demonstrated, any change in format,
   presentation, number of items (increase or decrease), etc., should be accepted. Continue
   to review and score the assessment. Direct the Scorer to record comment F.
   May Arise in Step(s): 6

7. Scoring Concern/Question: VE for ELA is submitted in a language other than English.
   Decision Rule/Rationale: Record “NS” for No Score for that date. Record Procedural Error
   comment 21. Continue to score next date.
   May Arise in Step(s): 8b-f

8. Scoring Concern/Question: Evidence is found that a mistake in documentation by the
   teacher was erased on the DSS, VE, or supporting evidence and was not crossed out,
   corrected, and initialed.
   Decision Rule/Rationale: A student may self-correct on a student work product, which does
   not require a notation by the teacher.
   • If a teacher-made mistake is crossed out and corrected but not initialed, score the
     assessment following the Scoring Procedures.
   • If the mistake was crossed out but not corrected or not initialed, record “NS” for No
     Score for that date. Record Procedural Error comment 13.
   • If a teacher-made erasure is confirmed, record “NS” for No Score for that date. Record
     Procedural Error comment 13. Continue to review and score other date for the
     Extension/AGLI following the Scoring Procedures.
   Note: Documentation made by the teacher does not have to be in permanent ink.

9. Scoring Concern/Question: VE or supporting evidence clearly appears to be homework.
   Decision Rule/Rationale:
   • If the Student Page indicates special education programs and services at home, in a
     hospital, or other facility, accept what is documented by the teacher and score the
     assessment following the Scoring Procedures.
   • If the Student Page does not indicate special education programs and services at home,
     in a hospital, or other facility, record “NS” for No Score for that date. Record
     Procedural Error comment 20. Continue to score next date.

10. Scoring Concern/Question: Chart or calendar is submitted for a date other than the last
    date recorded on the chart or calendar.
    Decision Rule/Rationale: A chart or calendar can be submitted for only a single date. If
    the date on the calendar or chart is within the administration period for the 2014-15
    NYSAA, accept the calendar or chart as evidence for that date. Score the assessment
    following the Scoring Procedures.

11. Scoring Concern/Question: Verifying evidence includes items, questions, steps not
    relevant to the assessment task.
    Decision Rule/Rationale:
    • Required elements are present, all requirements for the type of VE are met, and there
      is no obvious error in documentation. Accept what is documented by the teacher, do not
      recalculate the Level of Accuracy, and score the assessment following the Scoring
      Procedures.
    • If all requirements are NOT met:
      o Student work product: record “NS” for No Score for that date. Record Procedural
        Error comment 12.
      o Data Collection Sheet: record “NS” for No Score for that date. Record Procedural
        Error comment 12, 16, or 17.
      o Photographic, digital video, or audio: record “NS” for No Score for that date.
        Record Procedural Error comment 12, 14, or 15. Continue to score the next date.

12. Scoring Concern/Question: A multi-step DCS includes a single-step task, or single time
    segment is documented on a DCS.
    Decision Rule/Rationale:
    • All of the requirements for VE are met, the additional requirements for a DCS are met,
      and there is no obvious error in documentation. Score as documented on the DCS
      following the Scoring Procedures. Direct the Scorer to record comment L.
    • If a single-step task is documented on a multi-step DCS, score the assessment
      following the Scoring Procedures.
    • If a single time segment is documented on a DCS, score the assessment following the
      Scoring Procedures.

13. Scoring Concern/Question: Entry includes extra Data Summary Sheet(s) (i.e., separate DSS
    for each piece of VE).
    Decision Rule/Rationale: If the task assessed on the baseline and final dates is the
    same, adjust the most complete DSS with date, Level of Accuracy, and whether prompted
    from the other DSS (confirmed by VE) and continue to review and score the content area.
    Direct the Scorer to record comment D.

14. Scoring Concern/Question: Assessment Task (AT) code on VE/VE label does not match the
    Assessment Task code on the DSS.
    Decision Rule/Rationale:
    • The Assessment Task code is not a required element. If the VE connects to the task as
      documented on the DSS, score the assessment following the Scoring Procedures.
    • If the VE connects to the task documented on the VE, but not the task documented on
      the DSS, record “N” for No to “VE connects to Task” and “NS” for No Score for both the
      baseline and final dates. Record Procedural Error comment 8c.

Dates

15. Scoring Concern/Question: Dates or information printed in header and/or footer of
    documents completed with Measured Progress ProFile™ or other web-based program contradict
    information recorded on the evidence or VE label.
    Decision Rule/Rationale: Information printed in the header and/or footer of a document
    completed using the Measured Progress ProFile™ software or other web-based program (e.g.,
    News-2-You©) cannot be considered when reviewing documentation of student performance
    data. Score the assessment following the Scoring Procedures.

16. Scoring Concern/Question: The year portion of the date in the documentation is
    discrepant or missing (DSS, VE, SE).
    Decision Rule/Rationale: When the year in a date is discrepant (e.g., Oct. 15, 2015, or
    2/4/14) or missing (e.g., Oct. 15, or 2/4), but the month and day are within the
    acceptable parameters for the current assessment, it is considered a clerical error. Score
    the assessment following the Scoring Procedures.
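
The year-discrepancy rule treats a wrong or missing year as a clerical error when the month and day still fall inside the administration period. A minimal Python sketch of that check follows. The window dates `WINDOW_START` and `WINDOW_END` are hypothetical placeholders chosen only for illustration (the actual 2014-15 administration period is not given in this appendix), and the function name `month_day_in_window` is likewise an assumption.

```python
from datetime import date


def month_day_in_window(month, day, start, end):
    """Return True if (month, day) falls inside [start, end] when paired
    with any calendar year that the window covers."""
    for year in range(start.year, end.year + 1):
        try:
            candidate = date(year, month, day)
        except ValueError:  # e.g., Feb 29 in a non-leap year
            continue
        if start <= candidate <= end:
            return True
    return False


# Hypothetical administration window, for illustration only.
WINDOW_START = date(2014, 10, 1)
WINDOW_END = date(2015, 2, 28)

print(month_day_in_window(10, 15, WINDOW_START, WINDOW_END))  # True
print(month_day_in_window(2, 4, WINDOW_START, WINDOW_END))    # True
print(month_day_in_window(6, 1, WINDOW_START, WINDOW_END))    # False
```

Trying each year the window spans handles an administration period that crosses the calendar-year boundary, which is the situation the examples "Oct. 15" and "2/4" imply.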


8a-f 


7or 
8b or 8d 


8b-f or 
10 
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APPENDIX D—SUBGROUP RELIABILITY 


2014-15 NYSAA Technical Report: Appendix D—Subgroup Reliability -1- 


Table D-1. 2014-15 NYSAA: Subgroup Reliabilities— 
English Language Arts 


                                                     Number of  Total Scaled Score
Grade        Group                                   Students   Minimum  Maximum  Mean    Standard Deviation  Alpha  SEM
3            All Students                            2,745      400      525      467.50  24.61               0.87   8.79
3            Male                                    1,902      400      525      468.71  24.37               0.87   8.84
3            Female                                  843        400      525      464.76  24.94               0.88   8.65
3            American Indian/Alaskan Native          23         400      525      469.87  33.33               0.93   8.60
3            Black                                   682        400      525      470.17  23.99               0.87   8.70
3            Asian                                   190        400      525      468.67  27.56               0.89   8.95
3            Hispanic                                826        400      525      470.72  25.28               0.88   8.77
3            White                                   972        400      525      462.85  23.03               0.85   8.82
3            Native Hawaiian/Other Pacific Islander  11         400      525      475.00  25.03               0.92   7.22
3            Multi                                   41         400      525      459.59  19.81               0.81   8.74
4            All Students                            2,967      400      525      466.36  21.61               0.83   8.85
4            Male                                    2,040      400      525      466.86  21.50               0.83   8.79
4            Female                                  927        400      525      465.27  21.80               0.83   8.97
4            American Indian/Alaskan Native          23         400      525      467.96  22.85               0.84   9.20
4            Black                                   733        400      525      468.63  20.84               0.82   8.73
4            Asian                                   175        400      525      465.66  20.62               0.82   8.65
4            Hispanic                                803        400      525      469.16  22.30               0.85   8.54
4            White                                   1,173      400      525      463.24  21.46               0.82   9.07
4            Native Hawaiian/Other Pacific Islander  19         400      525      458.00  18.14               0.81   7.89
4            Multi                                   41         400      525      466.32  18.45               0.59   11.85
5            All Students                            3,127      400      525      465.62  20.03               0.82   8.40
5            Male                                    2,153      400      525      466.08  19.90               0.82   8.46
5            Female                                  974        400      525      464.61  20.29               0.83   8.28
5            American Indian/Alaskan Native          29         400      525      455.72  19.40               0.80   8.77
5            Black                                   783        400      525      468.84  17.81               0.79   8.25
5            Asian                                   157        400      525      461.55  22.20               0.85   8.63
5            Hispanic                                906        400      525      467.92  20.05               0.83   8.32
5            White                                   1,211      400      525      462.51  20.54               0.83   8.55
5            Native Hawaiian/Other Pacific Islander  12         400      525      469.67  19.36               0.85   7.54
5            Multi                                   29         400      525      467.00  18.73               0.82   7.88
6            All Students                            3,212      400      525      468.10  22.30               0.86   8.23
6            Male                                    2,169      400      525      467.92  22.23               0.87   8.13
6            Female                                  1,043      400      525      468.47  22.46               0.86   8.44
6            American Indian/Alaskan Native          36         400      525      472.81  23.34               0.86   8.83
6            Black                                   791        400      525      471.37  21.03               0.85   8.25
6            Asian                                   172        400      525      468.09  22.12               0.88   7.74
6            Hispanic                                870        400      525      470.23  22.88               0.88   7.91
6            White                                   1,301      400      525      464.47  22.30               0.86   8.45
6            Native Hawaiian/Other Pacific Islander  9          400      525
6            Multi                                   33         400      525      471.45  19.00               0.72   10.04
7            All Students                            3,361      400      525      463.28  24.72               0.84   9.87
7            Male                                    2,343      400      525      463.58  24.57               0.84   9.86
7            Female                                  1,018      400      525      462.60  25.04               0.84   9.90
7            American Indian/Alaskan Native          28         400      525      462.93  29.23               0.90   9.11
7            Black                                   816        400      525      467.83  25.50               0.85   10.00
7            Asian                                   164        400      525      456.95  24.81               0.87   8.86
7            Hispanic                                865        400      525      467.12  24.34               0.82   10.25
7            White                                   1,438      400      525      459.04  23.54               0.83   9.73
7            Native Hawaiian/Other Pacific Islander  13         400      525      468.54  21.34               0.81   9.25
7            Multi                                   37         400      525      464.76  25.28               0.85   9.88
8            All Students                            3,398      400      525      460.14  24.97               0.89   8.39
8            Male                                    2,281      400      525      459.85  24.94               0.89   8.37
8            Female                                  1,117      400      525      460.74  25.02               0.89   8.43
8            American Indian/Alaskan Native          24         400      525      463.17  16.32               0.69   9.09
8            Black                                   867        400      525      462.57  25.33               0.89   8.58
8            Asian                                   177        400      525      457.86  25.68               0.90   8.08
8            Hispanic                                910        400      525      464.70  26.70               0.91   8.20
8            White                                   1,382      400      525      455.90  22.92               0.86   8.42
8            Native Hawaiian/Other Pacific Islander  11         400      525      456.18  9.79                0.66   5.74
8            Multi                                   27         400      525      459.00  23.79               0.89   7.98
High School  All Students                            2,900      400      525      467.22  26.20               0.91   8.05
High School  Male                                    1,885      400      525      467.70  26.15               0.90   8.09
High School  Female                                  1,015      400      525      466.31  26.29               0.91   7.98
High School  American Indian/Alaskan Native          27         400      525      474.04  31.15               0.92   8.94
High School  Black                                   735        400      525      469.37  26.06               0.90   8.34
High School  Asian                                   141        400      525      468.16  30.65               0.93   7.91
High School  Hispanic                                662        400      525      473.32  27.86               0.91   8.15
High School  White                                   1,302      400      525      462.64  23.89               0.89   7.84
High School  Native Hawaiian/Other Pacific Islander  10         400      525      484.40  31.61               0.94   7.41
High School  Multi                                   23         400      525      460.13  20.96               0.85   8.01
Table D-2. 2014-15 NYSAA: Subgroup Reliabilities— 
Mathematics 
                                                     Number of  Total Scaled Score
Grade        Group                                   Students   Minimum  Maximum  Mean    Standard Deviation  Alpha  SEM
3            All Students                            2,746      400      525      467.41  24.70               0.88   8.45
3            Male                                    1,902      400      525      468.63  24.55               0.88   8.51
3            Female                                  844        400      525      464.66  24.85               0.89   8.31
3            American Indian/Alaskan Native          23         400      525      462.04  28.73               0.91   8.62
3            Black                                   684        400      525      470.28  23.95               0.87   8.47
3            Asian                                   192        400      525      466.72  27.17               0.90   8.58
3            Hispanic                                825        400      525      471.19  24.92               0.88   8.53
3            White                                   970        400      525      462.65  23.69               0.88   8.30
3            Native Hawaiian/Other Pacific Islander  11         400      525      474.64  26.95               0.92   7.71
3            Multi                                   41         400      525      460.41  21.64               0.87   7.76
4            All Students                            2,969      400      525      464.60  26.27               0.87   9.52
4            Male                                    2,039      400      525      465.62  26.54               0.87   9.58
4            Female                                  930        400      525      462.35  25.56               0.86   9.41
4            American Indian/Alaskan Native          23         400      525      466.00  25.79               0.86   9.58
4            Black                                   733        400      525      469.38  25.68               0.86   9.66
4            Asian                                   175        400      525      465.82  25.54               0.86   9.43
4            Hispanic                                806        400      525      468.45  27.66               0.89   9.35
4            White                                   1,173      400      525      458.96  24.91               0.85   9.54
4            Native Hawaiian/Other Pacific Islander  18         400      525      461.56  25.58               0.85   9.86
4            Multi                                   41         400      525      459.76  20.26               0.76   9.93
5            All Students                            3,129      400      525      461.82  23.45               0.86   8.67
5            Male                                    2,152      400      525      462.70  23.33               0.86   8.74
5            Female                                  977        400      525      459.88  23.61               0.87   8.49
5            American Indian/Alaskan Native          29         400      525      454.03  23.76               0.88   8.39
5            Black                                   782        400      525      464.03  21.78               0.84   8.60
5            Asian                                   156        400      525      459.28  24.18               0.88   8.39
5            Hispanic                                907        400      525      465.83  24.32               0.87   8.81
5            White                                   1,213      400      525      457.87  23.08               0.86   8.63
5            Native Hawaiian/Other Pacific Islander  12         400      525      464.42  23.06               0.85   8.97
5            Multi                                   30         400      525      462.43  23.34               0.88   8.22
6            All Students                            3,213      400      525      464.37  25.50               0.90   7.86
6            Male                                    2,169      400      525      464.57  25.71               0.91   7.74
6            Female                                  1,044      400      525      463.94  25.06               0.89   8.13
6            American Indian/Alaskan Native          36         400      525      471.47  26.06               0.90   8.05
6            Black                                   792        400      525      468.03  25.03               0.90   7.92
6            Asian                                   172        400      525      462.46  25.65               0.90   8.27
6            Hispanic                                870        400      525      467.31  26.24               0.90   8.13
6            White                                   1,301      400      525      460.11  24.70               0.91   7.57
6            Native Hawaiian/Other Pacific Islander  9          400      525
6            Multi                                   33         400      525      468.85  22.65               0.84   9.00
7            All Students                            3,364      400      525      466.82  25.47               0.84   10.19
7            Male                                    2,344      400      525      467.13  25.31               0.84   10.14
7            Female                                  1,020      400      525      466.12  25.85               0.84   10.28
7            American Indian/Alaskan Native          28         400      525      468.29  27.09               0.88   9.43
7            Black                                   818        400      525      471.16  26.14               0.85   10.15
7            Asian                                   164        400      525      463.18  25.87               0.85   9.97
7            Hispanic                                864        400      525      470.10  25.23               0.83   10.30
7            White                                   1,441      400      525      462.77  24.46               0.82   10.24
7            Native Hawaiian/Other Pacific Islander  13         400      525      472.23  24.97               0.88   8.82
7            Multi                                   36         400      525      465.19  26.94               0.86   10.10
8            All Students                            3,392      400      525      457.68  25.00               0.89   8.46
8            Male                                    2,280      400      525      457.42  25.04               0.89   8.43
8            Female                                  1,112      400      525      458.22  24.93               0.88   8.53
8            American Indian/Alaskan Native          24         400      525      459.58  17.22               0.72   9.19
8            Black                                   865        400      525      459.63  25.26               0.88   8.76
8            Asian                                   177        400      525      454.83  27.38               0.92   7.97
8            Hispanic                                909        400      525      462.57  26.53               0.89   8.72
8            White                                   1,379      400      525      453.53  22.94               0.87   8.15
8            Native Hawaiian/Other Pacific Islander  11         400      525      457.55  12.40               0.62   7.66
8            Multi                                   27         400      525      459.96  23.38               0.89   7.74
High School  All Students                            2,903      400      525      470.14  28.28               0.87   10.28
High School  Male                                    1,885      400      525      470.98  28.18               0.86   10.40
High School  Female                                  1,018      400      525      468.57  28.40               0.87   10.04
High School  American Indian/Alaskan Native          27         400      525      473.00  32.40               0.88   11.44
High School  Black                                   735        400      525      471.44  28.70               0.87   10.19
High School  Asian                                   141        400      525      472.01  30.97               0.89   10.07
High School  Hispanic                                660        400      525      475.63  29.90               0.89   9.95
High School  White                                   1,307      400      525      466.40  26.24               0.84   10.46
High School  Native Hawaiian/Other Pacific Islander  10         400      525      485.50  30.28               0.93   8.23
High School  Multi                                   23         400      525      461.74  25.16               0.82   10.67
Table D-3. 2014-15 NYSAA: Subgroup Reliabilities— 
Science 
                                                     Number of  Total Scaled Score
Grade        Group                                   Students   Minimum  Maximum  Mean    Standard Deviation  Alpha  SEM
4            All Students                            2,953      550      600      576.50  10.54               0.74   5.38
4            Male                                    2,030      550      600      576.83  10.43               0.73   5.45
4            Female                                  923        550      600      575.76  10.76               0.76   5.24
4            American Indian/Alaskan Native          23         550      600      575.61  11.22               0.86   4.18
4            Black                                   729        550      600      578.01  10.34               0.71   5.56
4            Asian                                   174        550      600      576.50  10.80               0.80   4.79
4            Hispanic                                803        550      600      577.19  10.81               0.77   5.20
4            White                                   1,165      550      600      575.07  10.32               0.72   5.46
4            Native Hawaiian/Other Pacific Islander  18         550      600      572.89  9.99                0.73   5.16
4            Multi                                   41         550      600      578.88  8.59                0.47   6.23
8            All Students                            3,382      550      600      578.83  10.48               0.70   5.75
8            Male                                    2,273      550      600      578.67  10.64               0.71   5.72
8            Female                                  1,109      550      600      579.14  10.15               0.67   5.82
8            American Indian/Alaskan Native          24         550      600      580.33  11.58               0.77   5.58
8            Black                                   864        550      600      579.34  10.19               0.71   5.52
8            Asian                                   177        550      600      578.50  10.72               0.72   5.72
8            Hispanic                                902        550      600      580.36  10.70               0.68   6.05
8            White                                   1,377      550      600      577.55  10.36               0.70   5.71
8            Native Hawaiian/Other Pacific Islander  11         550      600      575.73  3.23                -1.79  5.39
8            Multi                                   27         550      600      577.96  10.69               0.61   6.70
High School  All Students                            2,896      550      600      580.43  11.59               0.77   5.59
High School  Male                                    1,880      550      600      580.54  11.47               0.77   5.54
High School  Female                                  1,016      550      600      580.23  11.81               0.77   5.68
High School  American Indian/Alaskan Native          27         550      600      581.37  12.80               0.83   5.20
High School  Black                                   731        550      600      581.26  11.20               0.75   5.65
High School  Asian                                   141        550      600      580.08  13.41               0.82   5.65
High School  Hispanic                                659        550      600      582.14  11.56               0.76   5.69
High School  White                                   1,305      550      600      579.08  11.46               0.77   5.51
High School  Native Hawaiian/Other Pacific Islander  10         550      600      586.20  13.00               0.93   3.54
High School  Multi                                   23         550      600      580.91  9.87                0.63   6.02
Table D-4. 2014-15 NYSAA: Subgroup Reliabilities— 
Social Studies 
                                                     Number of  Total Scaled Score
Grade        Group                                   Students   Minimum  Maximum  Mean    Standard Deviation  Alpha  SEM
High School  All Students                            2,886      550      600      580.17  10.99               0.80   4.89
High School  Male                                    1,875      550      600      580.42  10.83               0.80   4.90
High School  Female                                  1,011      550      600      579.71  11.26               0.81   4.87
High School  American Indian/Alaskan Native          26         550      600      583.38  11.34               0.81   4.99
High School  Black                                   731        550      600      581.27  10.87               0.80   4.90
High School  Asian                                   141        550      600      580.17  12.12               0.82   5.08
High School  Hispanic                                657        550      600      582.10  11.42               0.83   4.77
High School  White                                   1,298      550      600      578.52  10.46               0.78   4.95
High School  Native Hawaiian/Other Pacific Islander  10         550      600      585.30  12.08               0.86   4.56
High School  Multi                                   23         550      600      577.39  9.38                0.80   4.25
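
The SEM column in Tables D-1 through D-4 is numerically consistent with the standard classical-test-theory relation SEM = SD × sqrt(1 − alpha). This appendix does not state the formula, so the sketch below treats it as an assumption and simply checks it against two tabulated rows. Because the published SD and alpha values are rounded, recomputed values agree only approximately. The function name `sem_from_alpha` is hypothetical.

```python
import math


def sem_from_alpha(sd, alpha):
    """Standard error of measurement under the classical relation
    SEM = SD * sqrt(1 - alpha). Assumed formula, not quoted from the report."""
    return sd * math.sqrt(1.0 - alpha)


# Table D-4, high school social studies, All Students (reported SEM: 4.89)
print(sem_from_alpha(10.99, 0.80))

# Table D-3, Grade 8 science, Native Hawaiian/Other Pacific Islander,
# where the alpha estimate is negative (reported SEM: 5.39)
print(sem_from_alpha(3.23, -1.79))
```

The second row shows why a very small subgroup (11 students) should be read with caution: coefficient alpha can come out negative in small samples, yet the relation still yields a positive SEM because 1 − alpha remains positive.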
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Table E-1. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items — 


English Language Arts Grade 3 


               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference

1 60 0 100.00 1.00 0.00 
314 2 397 3 99.25 1.00 2.17 
3 46 0 100.00 1.00 0.00 
1 367 7 98.13 0.97 4.29 
322 2 113 3 97.41 0.96 2.90 
3 20 2 90.91 0.74 2.90 
1 269 4 98.53 1.00 0.58 
331 2 160 1 99.38 0.98 4.00 
3 85 1 98.84 0.99 2.50 
1 111 1 99.11 0.98 6.70 
341 2 299 3 99.01 0.94 8.33 
3 92 0 100.00 1.00 0.00 
1 248 7 97.25 1.00 0.69 
351 2 190 3 98.45 1.00 1.53 
3 66 2 97.06 0.97 2.90 


SD is blank when the number of responses scored twice is less than 10. 
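
The Percent Exact values in Table E-1 can be reproduced by dividing the exact-match count by the sum of the two count columns. This relationship is inferred from the tabulated values rather than stated in the report, so the sketch below is a consistency check under that assumption; the function name `percent_exact` and the argument name `second_count` (the column labeled "Responses Scored Twice") are hypothetical.

```python
def percent_exact(exact_matches, second_count):
    """Percentage of exact matches, assuming the denominator is the sum of
    the two count columns in the interrater tables."""
    total = exact_matches + second_count
    return 100.0 * exact_matches / total


# Table E-1, Standard 314, LOC 2 (reported Percent Exact: 99.25)
print(round(percent_exact(397, 3), 2))

# Table E-1, Standard 322, LOC 3 (reported Percent Exact: 90.91)
print(round(percent_exact(20, 2), 2))
```

The same check is useful for spotting extraction errors in the count columns, since a garbled count will no longer reproduce the reported percentage.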


Table E-2. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
English Language Arts Grade 4

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference
1 317 4 98.75 1.00 0.93 
411 2 264 5 98.14 0.99 1.26
3 18 0 100.00 1.00 0.00 
1 262 5 98.13 0.96 3.66 
413 2 237 0 100.00 1.00 0.00 
3 78 2 97.50 0.90 5.15 
1 401 5 98.77 1.00 1.18 
432 2 114 2 98.28 0.93 6.25 
3 70 2 97.22 0.93 3.75 
1 127 3 97.69 0.99 2.10 
442 2 431 5 98.85 0.99 1.82 
3 22 0 100.00 1.00 0.00 
1 89 2 97.80 0.94 5.25 
453 2 431 14 96.85 0.99 1.22 0.77 
3 61 0 100.00 1.00 0.00 
SD is blank when the number of responses scored twice is less than 10.


Table E-3. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
English Language Arts Grade 5

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference
1 351 4 98.87 1.00 1.80 
512 2 208 4 98.11 0.97 2.95 
3 42 2 95.45 0.92 3.75 
1 222 4 98.23 0.96 4.58 
523 2 361 9 97.57 0.96 2.49 
3 18 0 100.00 1.00 0.00 
1 138 4 97.18 0.97 4.18 
533 2 429 6 98.62 0.98 2.50 
3 25 1 96.15 0.62 4.00 
1 176 1 99.44 1.00 2.00
541 2 382 6 98.45 0.99 2.02 
3 49 1 98.00 0.97 2.50 
1 283 2 99.30 1.00 1.75 
552 2 268 5 98.17 0.98 2.32 
3 49 0 100.00 1.00 0.00 


SD is blank when the number of responses scored twice is less than 10. 


Table E-4. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
English Language Arts Grade 6

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference

1 216 3 98.63 0.98 4.03 

611 2 316 3 99.06 0.99 2.37 

3 59 2 96.72 0.80 5.00 

1 231 3 98.72 0.99 2.80 

621 2 331 5 98.51 0.99 2.20 

3 32 0 100.00 1.00 0.00 

1 113 5 95.76 0.93 3.34 

631 2 462 9 98.09 0.99 1.64 

3 12 0 100.00 1.00 0.00 

1 278 5 98.23 0.99 2.04 

641 2 279 3 98.94 0.98 2.40 

3 26 1 96.30 0.21 10.00

1 219 2 99.10 1.00 1.30 

651 2 274 6 97.86 0.99 1.32 

3 97 1 98.98 1.00 1.00 

SD is blank when the number of responses scored twice is less than 10.


Table E-5. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
English Language Arts Grade 7

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference

1 437 5 98.87 0.99 2.72 
713 2 117 2 98.32 0.84 7.50 
3 67 1 98.53 0.97 3.30 
1 366 6 98.39 0.96 4.18 
724 2 120 1 99.17 0.98 3.30 
3 143 0 100.00 1.00 0.00 
1 390 5 98.73 0.99 2.84 
732 2 187 4 97.91 0.88 6.00 
3 38 0 100.00 1.00 0.00 
1 221 2 99.10 0.99 2.55 
741 2 361 5 98.63 0.98 2.62 
3 42 1 97.67 0.79 10.00 
1 315 4 98.75 0.97 3.60 
753 2 55 0 100.00 1.00 0.00 
3 260 6 97.74 0.97 2.05 


SD is blank when the number of responses scored twice is less than 10. 


Table E-6. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
English Language Arts Grade 8

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference

1 381 5 98.70 0.99 2.76 

822 2 201 1 99.50 1.00 2.50 

3 50 1 98.04 0.90 5.90 

1 407 5 98.79 0.97 5.10

823 2 131 2 98.50 0.97 4.15 

3 80 2 97.56 0.96 3.50 

1 332 4 98.81 0.97 5.45 

833 2 247 2 99.20 1.00 1.70 

3 48 0 100.00 1.00 0.00 

1 288 4 98.63 0.99 2.23 

842 2 260 1 99.62 1.00 2.00 

3 63 0 100.00 1.00 0.00 

1 364 2 99.45 1.00 0.90 

852 2 170 7 96.05 0.95 2.74 

3 82 0 100.00 1.00 0.00 

SD is blank when the number of responses scored twice is less than 10.


Table E-7. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
English Language Arts High School

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference

1 127 0 100.00 1.00 0.00
911 2 331 2 99.40 1.00 0.10 
3 98 3 97.03 0.87 6.33 
1 251 5 98.05 0.98 2.52 
921 2 243 3 98.78 0.98 3.43 
3 76 0 100.00 1.00 0.00 
1 154 3 98.09 0.95 3.90 
931 2 342 5 98.56 0.99 1.80 
3 74 0 100.00 1.00 0.00 
1 374 7 98.16 0.98 3.21 
942 2 88 0 100.00 1.00 0.00 
3 87 3 96.67 0.96 2.20 
1 204 1 99.51 1.00 1.00 
951 2 241 6 97.57 0.99 1.25 
3 118 4 96.72 0.97 1.90 


SD is blank when the number of responses scored twice is less than 10. 


Table E-8. 2014-15 NYSAA: Interrater Consistency Statistics for S/LOC Items –
Mathematics Grade 3

               Number of                  Percent                Mean        Standard
Standard  LOC  Exact     Responses        Exact    Correlation   Absolute    Deviation
               Matches   Scored Twice                            Difference

1 331 3 99.10 1.00 2.67 

301 2 101 2 98.06 0.99 1.80 

3 84 1 98.82 0.82 10.00 

1 162 1 99.39 1.00 0.30 

302 2 239 4 98.35 0.94 4.00 

3 111 0 100.00 1.00 0.00 

1 284 0 100.00 1.00 0.00 

303 2 213 5 97.71 0.99 1.32 

3 11 0 100.00 1.00 0.00 

1 134 1 99.26 1.00 2.00 

304 2 355 4 98.89 0.99 2.38 

3 10 0 100.00 1.00 0.00 

1 104 4 96.30 0.99 1.63 

305 2 283 3 98.95 1.00 1.07 

3 96 0 100.00 1.00 0.00 

SD is blank when the number of responses scored twice is less than 10. 
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Table E-9. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 


Mathematics Grade 4 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 291 5 98.31 0.99 2.06 
401 2 194 4 97.98 0.93 4.93 
3 104 0 100.00 1.00 0.00 
1 323 2 99.38 1.00 1.25 
402 2 58 1 98.31 1.00 1.20 
3 209 5 97.66 0.91 3.72 
1 422 6 98.60 0.99 2.55 
403 2 155 2 98.73 1.00 0.70 
3 17 0 100.00 1.00 0.00 
1 322 2 99.38 1.00 2.90 
404 2 216 5 97.74 0.99 1.60 
3 49 2 96.08 0.93 2.90 
1 153 2 98.71 1.00 0.75 
405 2 290 12 96.03 0.94 2.14 2.71 

3 140 1 99.29 0.95 6.70 


SD is blank when the number of responses scored twice is less than 10. 


Table E-10. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 


Mathematics Grade 5 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 245 7 97.22 0.93 4.69 
501 2 185 5 97.37 0.98 2.30 
3 158 5 96.93 0.86 5.60 
1 378 7 98.18 0.98 3.00
502 2 147 10 93.63 0.93 2.58 2.34 
3 78 0 100.00 1.00 0.00 
1 456 8 98.28 0.96 4.43 
503 2 96 1 98.97 1.00 1.50 
3 33 0 100.00 1.00 0.00 
1 342 4 98.84 0.99 2.28 
504 2 214 4 98.17 0.99 2.13 
3 53 2 96.36 1.00 0.15 
1 128 1 99.22 1.00 0.60 
505 2 424 5 98.83 1.00 0.72 
3 54 0 100.00 1.00 0.00 
SD is blank when the number of responses scored twice is less than 10. 
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Table E-11. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 
Mathematics Grade 6 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 304 7 97.75 0.97 3.43 
605 2 222 3 98.67 0.87 7.33 
3 54 0 100.00 1.00 0.00 
1 211 1 99.53 0.99 4.00 
606 2 308 15 95.36 0.97 1.43 1.22 
3 66 3 95.65 0.97 2.17 
1 263 8 97.05 0.99 1.83 
607 2 299 12 96.14 0.92 2.98 2.87 
3 20 0 100.00 1.00 0.00 
1 203 4 98.07 0.99 2.40 
608 2 353 14 96.19 0.98 1.79 0.92 
3 28 0 100.00 1.00 0.00 
1 340 6 98.27 0.98 3.38
618 2 132 1 99.25 0.99 2.50 
3 121 2 98.37 0.99 1.60 


SD is blank when the number of responses scored twice is less than 10. 


Table E-12. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 
Mathematics Grade 7 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 87 3 96.67 0.99 3.00 
705 2 443 4 99.11 1.00 1.95 
3 97 1 98.98 1.00 1.00 
1 269 6 97.82 0.98 2.58 
706 2 212 4 98.15 0.98 2.68 
3 130 2 98.48 1.00 1.05 
1 465 12 97.48 0.97 2.79 2.63 
707 2 125 3 97.66 0.99 1.67 
3 38 0 100.00 1.00 0.00 
1 332 6 98.22 0.99 2.33 
708 2 59 1 98.33 0.99 1.60 
3 236 11 95.55 0.98 1.30 0.79
1 223 8 96.54 0.97 2.41 
710 2 273 8 97.15 0.98 2.04 
3 126 2 98.44 0.98 2.10 


SD is blank when the number of responses scored twice is less than 10. 
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Table E-13. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 


Mathematics Grade 8 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 342 11 96.88 0.98 1.90 1.65 

805 2 228 3 98.70 0.98 2.50 
3 56 2 96.55 0.97 1.85 
1 414 4 99.04 1.00 2.15 
808 2 133 1 99.25 0.95 10.00 
3 84 0 100.00 1.00 0.00 
1 337 7 97.97 0.98 1.93 
809 2 239 3 98.76 0.97 4.60 
3 44 2 95.65 0.62 3.85 
1 326 3 99.09 1.00 0.57 
810 2 144 5 96.64 0.98 1.50 
3 157 1 99.37 0.99 5.00 
1 527 7 98.69 1.00 1.93 
818 2 71 1 98.61 0.83 10.00 
3 27 1 96.43 0.39 10.00 


SD is blank when the number of responses scored twice is less than 10. 


Table E-14. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 


Mathematics High School 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 258 3 98.85 1.00 1.57 
911 2 155 1 99.36 1.00 0.90 
3 152 3 98.06 0.99 1.67 
1 110 5 95.65 0.97 2.52
912 2 381 8 97.94 0.99 2.44 
3 43 2 95.56 0.30 10.00 
1 228 7 97.02 1.00 1.04 
913 2 162 5 97.01 0.98 3.00 
3 158 1 99.37 0.97 5.00 
1 160 3 98.16 0.98 3.97 
914 2 252 11 95.82 0.98 2.34 1.90 
3 141 5 96.58 0.91 3.86 
1 277 4 98.58 0.95 6.18 
915 2 76 0 100.00 1.00 0.00 
3 193 5 97.47 0.93 4.46 
SD is blank when the number of responses scored twice is less than 10. 
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Table E-15. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 
Science Grade 4 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 283 3 98.95 1.00 2.23 
411 2 244 4 98.39 0.96 4.55 
3 58 1 98.31 0.79 6.00 
1 254 2 99.22 0.94 10.00 
422 2 269 5 98.18 0.86 4.82 
3 65 1 98.48 0.96 3.00 


SD is blank when the number of responses scored twice is less than 10. 


Table E-16. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 
Science Grade 8 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference

1 257 2 99.23 0.99 3.75 

813 2 259 4 98.48 1.00 1.08 
3 117 0 100.00 1.00 0.00 
1 108 2 98.18 0.92 6.65 

832 2 443 14 96.94 0.97 2.36 1.90 
3 82 0 100.00 1.00 0.00 


SD is blank when the number of responses scored twice is less than 10. 


Table E-17. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 
Science High School 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 193 2 98.97 1.00 2.00
921 2 153 5 96.84 0.98 2.00 
3 219 6 97.33 0.97 2.58 
1 113 5 95.76 0.98 2.86 
931 2 397 6 98.51 0.97 3.10 
3 62 0 100.00 1.00 0.00 


SD is blank when the number of responses scored twice is less than 10. 
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Table E-18. 2014-15 NYSAA: Interrater Consistency Statistics for SILOC Items — 
Social Studies High School 


             Number of  Number of     Percent               Mean        Standard
Standard LOC Exact      Responses     Exact    Correlation  Absolute    Deviation
             Matches    Scored Twice                        Difference
1 128 0 100.00 1.00 0.00 
911 2 343 3 99.13 0.99 2.47 
3 107 3 97.27 0.94 1.50 
1 90 0 100.00 1.00 0.00 
921 2 387 5 98.72 0.99 2.44 
3 85 2 97.70 0.85 3.75 


SD is blank when the number of responses scored twice is less than 10. 
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APPENDIX F—PERFORMANCE LEVEL 
DISTRIBUTIONS 
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Table F-1. 2014-15 NYSAA: Performance Level Distributions 
by Content Area and Grade 


Content Area           Grade        Performance Level¹  Percent at Level 2014-15
English Language Arts  3            21                  5.21
                                    22                  14.21
                                    23                  63.72
                                    24                  16.87
                       4            21                  2.70
                                    22                  11.66
                                    23                  72.80
                                    24                  12.84
                       5            21                  2.85
                                    22                  9.53
                                    23                  69.94
                                    24                  17.68
                       6            21                  4.08
                                    22                  [illegible]
                                    23                  73.29
                                    24                  11.83
                       7            21                  6.19
                                    22                  11.40
                                    23                  62.09
                                    24                  20.32
                       8            21                  9.01
                                    22                  9.89
                                    23                  59.09
                                    24                  22.01
                       High School  21                  5.86
                                    22                  17.24
                                    23                  60.00
                                    24                  16.90
Mathematics            3            21                  8.12
                                    22                  13.47
                                    23                  62.35
                                    24                  16.06
                       4            21                  7.54
                                    22                  14.52
                                    23                  58.27
                                    24                  19.67
                       5            21                  5.24
                                    22                  10.64
                                    23                  66.67
                                    24                  17.45
                       6            21                  5.04
                                    22                  16.46
                                    23                  53.19
                                    24                  25.30
                       7            21                  4.67
                                    22                  15.01
                                    23                  63.14
                                    24                  17.18
                       8            21                  8.08
                                    22                  15.21
                                    23                  59.99
                                    24                  16.72
                       High School  21                  5.51
                                    22                  13.37
                                    23                  60.32
                                    24                  20.81
Science                4            21                  3.89
                                    22                  12.02
                                    23                  70.50
                                    24                  13.58
                       8            21                  3.13
                                    22                  12.24
                                    23                  69.57
                                    24                  15.05
                       High School  21                  3.07
                                    22                  14.02
                                    23                  56.32
                                    24                  26.59
Social Studies         High School  21                  4.57
                                    22                  [illegible]
                                    23                  [illegible]
                                    24                  18.43


¹ 21 = Not Meeting Learning Standards, 22 = Partially Meeting
Learning Standards, 23 = Meeting Learning Standards,
24 = Meeting Learning Standards with Distinction
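
Because each grade's four performance-level percentages describe the same group of students, they should sum to 100 apart from rounding. The sketch below is a minimal consistency check under that assumption. The tolerance and the function name are illustrative choices, not part of the report.

```python
from typing import Sequence


def levels_sum_to_100(percents: Sequence[float], tolerance: float = 0.05) -> bool:
    """Return True when the level percentages sum to 100 within rounding error."""
    return abs(sum(percents) - 100.0) <= tolerance


# Rows taken from Table F-1: ELA Grade 3 and Mathematics Grade 6.
ela_grade_3 = [5.21, 14.21, 63.72, 16.87]
math_grade_6 = [5.04, 16.46, 53.19, 25.30]

print(levels_sum_to_100(ela_grade_3))   # sums to 100.01
print(levels_sum_to_100(math_grade_6))  # sums to 99.99
```

A row that fails this check is a candidate for a transcription or extraction error and is worth comparing against the original table.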
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APPENDIX G—CUMULATIVE DISTRIBUTION 
GRAPHS 
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Figure G-1. 2014-15 NYSAA: Cumulative Score Distributions 


Top: English Language Arts Grade 3 Bottom: English Language Arts Grade 4 


Cumulative Scale Score Distributions: English Language Arts Grade 3
Cumulative Scale Score Distributions: English Language Arts Grade 4


Figure G-2. 2014-15 NYSAA: Cumulative Score Distributions 


Top: English Language Arts Grade 5 Bottom: English Language Arts Grade 6 


Cumulative Scale Score Distributions: English Language Arts Grade 5
Cumulative Scale Score Distributions: English Language Arts Grade 6


Figure G-3. 2014-15 NYSAA: Cumulative Score Distributions 


Top: English Language Arts Grade 7 Bottom: English Language Arts Grade 8 


Cumulative Scale Score Distributions: English Language Arts Grade 7
Cumulative Scale Score Distributions: English Language Arts Grade 8


Figure G-4. 2014-15 NYSAA: Cumulative Score Distributions
Top: English Language Arts High School Bottom: Mathematics Grade 3


Cumulative Scale Score Distributions: English Language Arts Grade HS
(Horizontal axis: Scale Score; vertical axis: Cumulative Proportion; plotted series: 2015)


Cumulative Scale Score Distributions: Mathematics Grade 3 
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Figure G-5. 2014-15 NYSAA: Cumulative Score Distributions 
Top: Mathematics Grade 4 Bottom: Mathematics Grade 5 


Cumulative Scale Score Distributions: Mathematics Grade 4 
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Cumulative Scale Score Distributions: Mathematics Grade 5 
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Figure G-6. 2014-15 NYSAA: Cumulative Score Distributions 
Top: Mathematics Grade 6 Bottom: Mathematics Grade 7 


Cumulative Scale Score Distributions: Mathematics Grade 6 
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Figure G-7. 2014-15 NYSAA: Cumulative Score Distributions 
Top: Mathematics Grade 8 Bottom: Mathematics High School 


Cumulative Scale Score Distributions: Mathematics Grade 8 
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Cumulative Scale Score Distributions: Mathematics Grade HS 
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Figure G-8. 2014-15 NYSAA: Cumulative Score Distributions 
Top: Science Grade 4 Bottom: Science Grade 8 


Cumulative Scale Score Distributions: Science Grade 4 
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Cumulative Scale Score Distributions: Science Grade 8 
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Figure G-9. 2014-15 NYSAA: Cumulative Score Distributions 
Top: Science High School Bottom: Social Studies High School 


Cumulative Scale Score Distributions: Science Grade HS 
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Cumulative Scale Score Distributions: Social Studies Grade HS 
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APPENDIX H—CLASSICAL ITEM ANALYSIS 
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Table H-1. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 


English Language Arts Grade 3 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 374 74.21 34.96 0.67 0.74 
314 2 100 2,022 87.35 23.96 0.49 0.87 
3 100 311 92.44 17.81 0.28 0.92 
1 100 1,942 77.14 31.26 0.60 0.77 
322 2 100 601 86.27 22.99 0.39 0.86 
3 100 158 93.51 17.56 0.44 0.94 
1 100 1,492 83.37 27.95 0.69 0.83
331 2 100 788 91.49 16.86 0.40 0.91 
3 100 447 92.94 17.90 0.28 0.93 
1 100 675 74.55 33.99 0.72 0.75 
341 2 100 1,536 87.33 26.46 0.34 0.87 
3 100 505 93.61 15.40 0.40 0.94 
1 100 1,370 76.66 28.78 0.69 0.77 
351 2 100 944 87.06 24.81 0.36 0.87 
3 100 412 92.21 17.91 0.29 0.92 
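
The Difficulty column in these tables matches the mean score divided by the maximum score (for example, 74.21 / 100 = 0.74). The sketch below illustrates that relationship. It also shows one common classical test theory definition of discrimination, the Pearson correlation between item scores and total scores. Whether the report uses exactly that definition is an assumption here, and the function names are hypothetical.

```python
from math import sqrt
from typing import Sequence


def difficulty(mean_score: float, max_score: float) -> float:
    """Mean item score expressed as a proportion of the maximum score."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return round(mean_score / max_score, 2)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation, e.g. between item scores and total scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Mean and maximum score from the first row of Table H-1 and of Table H-19.
print(difficulty(74.21, 100))
print(difficulty(15.94, 25))
```

With the printed means, both calls reproduce the tabled difficulty values (0.74 and 0.64).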


Table H-2. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 


English Language Arts Grade 4 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,662 82.97 27.52 0.54 0.83 
411 2 100 1,172 91.74 16.42 0.36 0.92 
3 100 112 95.78 14.78 0.22 0.96 
1 100 1,373 84.10 27.80 0.60 0.84 
413 2 100 1,109 91.62 18.06 0.42 0.92 
3 100 437 88.17 19.23 0.19 0.88 
1 100 2,002 77.65 25.66 0.54 0.78
432 2 100 573 86.23 23.04 0.43 0.86 
3 100 354 93.78 15.91 0.20 0.94 
1 100 766 80.55 27.42 0.52 0.81 
442 2 100 2,016 88.82 20.11 0.43 0.89 
3 100 140 91.56 14.43 0.08 0.92 
1 100 500 74.00 33.29 0.66 0.74 
453 2 100 2,108 85.52 20.45 0.42 0.86 
3 100 321 92.73 15.98 0.32 0.93 
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Table H-3. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 


English Language Arts Grade 5 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,854 80.47 25.52 0.53 0.80 
512 2 100 1,014 88.04 18.41 0.40 0.88 
3 100 214 86.04 23.55 0.41 0.86 
1 100 1,209 80.51 29.09 0.61 0.81 
523 2 100 1,804 88.70 17.80 0.43 0.89 
3 100 77 96.73 11.94 0.16 0.97 
1 100 790 78.48 31.92 0.64 0.78 
533 2 100 2,200 88.60 20.87 0.39 0.89 
3 100 87 92.75 16.60 0.28 0.93 
1 100 930 84.10 29.13 0.55 0.84 
541 2 100 1,939 87.05 18.50 0.42 0.87 
3 100 225 94.57 16.05 0.19 0.95 
1 100 1,458 79.21 27.49 0.58 0.79 
552 2 100 1,357 88.08 20.26 0.37 0.88 
3 100 246 86.61 24.69 0.33 0.87 


Table H-4. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 


English Language Arts Grade 6 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,236 79.75 32.72 0.55 0.80 
611 2 100 1,640 91.65 19.48 0.40 0.92 
3 100 289 94.50 14.88 0.17 0.94 
1 100 1,369 85.05 30.77 0.57 0.85
621 2 100 1,626 91.23 17.30 0.32 0.91 
3 100 174 98.04 8.05 0.36 0.98 
1 100 702 78.33 31.87 0.64 0.78 
631 2 100 2,381 88.97 16.90 0.42 0.89 
3 100 76 96.21 13.51 0.36 0.96 
1 100 1,609 85.53 27.21 0.59 0.86 
641 2 100 1,388 90.54 16.96 0.39 0.91 
3 100 153 96.92 11.68 0.22 0.97 
1 100 1,255 78.19 31.49 0.60 0.78
651 2 100 1,369 88.38 16.59 0.37 0.88 
3 100 556 90.85 16.48 0.35 0.91 
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Table H-5. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 


English Language Arts Grade 7


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 2,387 86.70 26.98 0.45 0.87 
713 2 100 566 89.94 19.21 0.31 0.90 
3 100 345 93.08 15.76 0.39 0.93 
1 100 2,114 81.88 26.47 0.53 0.82 
724 2 100 564 89.48 18.60 0.32 0.89 
3 100 630 92.18 15.25 0.29 0.92 
1 100 2,170 84.86 27.18 0.48 0.85 
732 2 100 966 88.30 20.91 0.32 0.88 
3 100 157 96.25 9.74 0.07 0.96 
1 100 1,231 82.51 31.85 0.58 0.83 
741 2 100 1,869 88.78 21.56 0.34 0.89 
3 100 209 88.60 24.04 0.38 0.89 
1 100 1,758 80.28 26.38 0.58 0.80 
753 2 100 289 87.44 19.43 0.38 0.87 
3 100 1,278 91.34 15.55 0.31 0.91 


Table H-6. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 


English Language Arts Grade 8


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 2,069 78.92 29.50 0.54 0.79 
822 2 100 1,029 91.07 21.98 0.38 0.91 
3 100 242 93.74 14.25 0.27 0.94 
1 100 2,237 85.23 28.26 0.55 0.85 
823 2 100 626 87.39 23.80 0.39 0.87 
3 100 459 93.38 16.17 0.39 0.93 
1 100 1,899 82.75 31.49 0.59 0.83 
833 2 100 1,165 89.68 21.61 0.41 0.90 
3 100 276 95.53 11.75 0.34 0.96 
1 100 1,694 83.74 24.08 0.52 0.84
842 2 100 1,249 87.66 19.38 0.34 0.88
3 100 351 93.13 14.94 0.37 0.93
1 100 2,102 81.21 27.45 0.56 0.81
852 2 100 833 86.85 17.89 0.25 0.87 
3 100 427 94.08 13.99 0.35 0.94 


Table H-7. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
English Language Arts High School 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 749 75.51 35.71 0.57 0.76 
911 2 100 1,587 85.49 23.18 0.38 0.85 
3 100 504 88.37 22.74 0.35 0.88 
1 100 1,343 81.01 27.49 0.61 0.81 
921 2 100 1,148 84.69 24.51 0.44 0.85 
3 100 356 92.83 17.90 0.56 0.93 
1 100 969 80.85 27.56 0.60 0.81 
931 2 100 1,557 91.37 17.55 0.27 0.91 
3 100 324 92.65 15.64 0.56 0.93 
1 100 2,013 83.09 26.75 0.53 0.83 
942 2 100 428 87.65 18.33 0.38 0.88 
3 100 401 89.74 17.36 0.44 0.90 
1 100 1,127 75.87 28.67 0.61 0.76 
951 2 100 1,211 80.89 22.39 0.25 0.81 
3 100 519 91.30 16.86 0.32 0.91 


Table H-8. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Mathematics Grade 3 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,756 78.23 29.55 0.67 0.78 
301 2 100 542 83.28 21.86 0.55 0.83 
3 100 425 89.30 19.55 0.42 0.89 
1 100 893 73.05 33.06 0.66 0.73 
302 2 100 1,243 85.16 22.66 0.44 0.85 
3 100 587 91.63 15.38 0.38 0.92 
1 100 1,527 82.58 26.86 0.63 0.83
303 2 100 1,080 87.85 19.88 0.51 0.88
3 100 112 91.16 17.75 0.44 0.91
1 100 809 73.04 34.38 0.68 0.73
304 2 100 1,821 86.33 22.26 0.48 0.86
3 100 92 92.14 17.89 0.17 0.92
1 100 590 73.37 34.21 0.75 0.73
305 2 100 1,600 85.22 23.35 0.57 0.85 
3 100 516 92.96 17.10 0.43 0.93 
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Table H-9. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Mathematics Grade 4 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,490 77.14 28.78 0.54 0.77 
401 2 100 952 85.02 24.85 0.34 0.85 
3 100 498 91.20 18.33 0.22 0.91 
1 100 1,585 72.22 31.84 0.63 0.72 
402 2 100 316 80.79 24.24 0.44 0.81 
3 100 1,031 87.87 18.91 0.43 0.88 
1 100 2,055 81.68 27.04 0.55 0.82 
403 2 100 804 87.60 21.79 0.34 0.88 
3 100 84 94.64 15.02 0.35 0.95 
1 100 1,615 80.14 27.86 0.58 0.80 
404 2 100 1,023 88.05 18.56 0.41 0.88 
3 100 277 91.57 14.29 0.27 0.92 
1 100 895 81.05 27.97 0.60 0.81 
405 2 100 1,383 85.44 22.43 0.49 0.85 
3 100 642 90.24 20.44 0.34 0.90 


Table H-10. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Mathematics Grade 5 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,387 79.16 29.63 0.70 0.79
501 2 100 897 84.64 24.71 0.30 0.85 
3 100 777 88.25 21.63 0.28 0.88 
1 100 2,036 79.21 28.28 0.63 0.79 
502 2 100 743 85.34 23.39 0.26 0.85 
3 100 324 91.00 19.22 0.30 0.91 
1 100 2,449 81.28 28.73 0.60 0.81 
503 2 100 440 86.18 24.96 0.46 0.86 
3 100 155 87.50 22.17 0.53 0.88 
1 100 1,817 83.28 25.83 0.65 0.83
504 2 100 1,043 86.64 20.76 0.31 0.87 
3 100 242 89.44 21.91 0.40 0.89 
1 100 748 76.61 30.74 0.71 0.77 
505 2 100 2,088 91.09 16.72 0.37 0.91 
3 100 245 90.60 21.57 0.38 0.91 
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Table H-11. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Mathematics Grade 6 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,712 83.53 27.22 0.59 0.84 
605 2 100 1,183 89.60 21.64 0.33 0.90 
3 100 267 93.78 16.39 0.40 0.94 
1 100 1,227 79.11 27.27 0.63 0.79 
606 2 100 1,531 91.08 18.47 0.33 0.91 
3 100 403 90.87 18.71 0.36 0.91 
1 100 1,507 77.09 31.61 0.64 0.77 
607 2 100 1,524 89.61 21.65 0.36 0.90 
3 100 108 93.75 15.93 0.56 0.94 
1 100 1,235 78.06 28.38 0.67 0.78 
608 2 100 1,782 87.09 19.41 0.39 0.87 
3 100 155 94.92 14.29 0.35 0.95 
1 100 1,845 76.29 30.19 0.61 0.76 
618 2 100 646 89.27 20.63 0.50 0.89 
3 100 680 89.89 18.42 0.42 0.90 


Table H-12. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items
Mathematics Grade 7


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 537 69.65 38.34 0.52 0.70 
705 2 100 2,353 84.90 22.68 0.49 0.85 
3 100 431 92.88 14.58 0.36 0.93 
1 100 1,522 75.19 30.76 0.56 0.75 
706 2 100 1,144 85.07 24.04 0.48 0.85 
3 100 628 91.11 18.42 0.33 0.91 
1 100 2,557 78.96 31.00 0.54 0.79 
707 2 100 570 83.85 24.76 0.52 0.84 
3 100 181 92.86 14.90 0.32 0.93 
1 100 1,836 79.71 28.48 0.60 0.80
708 2 100 317 87.26 20.63 0.38 0.87
3 100 1,165 88.66 17.33 0.42 0.89
1 100 1,292 78.53 29.29 0.65 0.79
710 2 100 1,381 83.84 22.27 0.39 0.84 
3 100 644 95.61 14.48 0.45 0.96 
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Table H-13. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Mathematics Grade 8 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,964 83.04 27.14 0.62 0.83 
805 2 100 1,102 87.67 20.45 0.35 0.88 
3 100 279 93.46 17.97 0.36 0.93 
1 100 2,258 84.30 29.12 0.59 0.84 
808 2 100 713 84.16 31.31 0.47 0.84 
3 100 379 92.15 18.29 0.51 0.92 
1 100 1,932 76.46 30.92 0.63 0.76 
809 2 100 1,173 88.40 21.77 0.36 0.88 
3 100 213 94.53 16.11 0.44 0.95 
1 100 1,866 81.42 31.43 0.64 0.81 
810 2 100 686 90.50 19.86 0.25 0.91 
3 100 763 86.46 29.71 0.41 0.86 
1 100 2,821 78.36 29.91 0.55 0.78 
818 2 100 376 92.50 18.05 0.31 0.93 
3 100 163 93.09 15.15 0.33 0.93 


Table H-14. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items
Mathematics High School


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,357 80.16 27.21 0.58 0.80 
911 2 100 747 88.84 17.73 0.38 0.89 
3 100 743 89.02 21.07 0.39 0.89 
1 100 643 68.04 36.22 0.60 0.68 
912 2 100 1,889 81.72 25.83 0.44 0.82 
3 100 276 95.39 13.58 0.42 0.95 
1 100 1,260 76.44 32.22 0.63 0.76 
913 2 100 775 79.98 30.95 0.51 0.80 
3 100 788 90.73 20.74 0.45 0.91 
1 100 958 76.27 32.00 0.54 0.76
914 2 100 1,159 77.63 28.07 0.32 0.78 
3 100 721 91.30 21.12 0.33 0.91 
1 100 1,494 78.94 29.42 0.54 0.79 
915 2 100 478 84.92 27.41 0.35 0.85 
3 100 860 88.05 21.79 0.47 0.88 
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Table H-15. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Science Grade 4 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,405 81.00 27.21 0.46 0.81 
411 2 100 1,205 89.45 21.82 0.27 0.89 
3 100 292 95.26 11.62 0.08 0.95 
1 100 1,289 83.76 26.33 0.42 0.84 
422 2 100 1,309 92.97 16.12 0.24 0.93 
3 100 324 91.79 19.53 0.30 0.92 


Table H-16. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items
Science Grade 8


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,415 82.16 28.51 0.50 0.82 
813 2 100 1,320 87.27 20.38 0.25 0.87 
3 100 592 93.51 15.08 0.22 0.94 
1 100 743 79.34 29.28 0.46 0.79 
832 2 100 2,190 86.04 21.29 0.36 0.86 
3 100 404 93.35 14.14 0.29 0.93 


Table H-17. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items 
Science High School 


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 1,058 81.27 26.33 0.48 0.81 
921 2 100 777 85.86 22.05 0.35 0.86 
3 100 1,020 91.05 19.95 0.28 0.91 
1 100 654 78.13 29.92 0.47 0.78 
931 2 100 1,909 87.44 23.86 0.37 0.87 
3 100 294 94.81 14.95 0.25 0.95 


Table H-18. 2014-15 NYSAA: Classical Test Theory Statistics for S/LOC Items
Social Studies High School


Standard LOC Maximum Score N Mean SD Discrimination Difficulty
1 100 737 75.25 31.44 0.46 0.75 
911 2 100 1,662 87.54 24.34 0.34 0.88 
3 100 457 93.91 12.90 0.31 0.94 
1 100 528 67.32 41.30 0.54 0.67 
921 2 100 1,924 85.64 24.90 0.38 0.86 
3 100 388 94.94 14.40 0.30 0.95 
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Table H-19. 2014-15 NYSAA: Classical Test Theory Statistics for Standards-Based Items 


Subject Grade Item Maximum Score N Mean SD Difficulty Discrimination
314 25 2,707 15.94 4.94 0.64 0.63 
322 25 2,701 10.56 5.68 0.42 0.71 
3 331 25 2,727 13.35 6.49 0.53 0.73 
341 25 2,716 15.56 6.19 0.62 0.67 
351 25 2,726 13.13 6.61 0.53 0.73 
411 25 2,946 12.25 5.27 0.49 0.68 
413 25 2,919 13.85 6.10 0.55 0.60 
4 432 25 2,929 11.41 6.29 0.46 0.61 
442 25 2,922 14.57 4.76 0.58 0.60 
453 25 2,929 15.48 5.00 0.62 0.65 
512 25 3,082 11.84 5.50 0.47 0.60 
523 25 3,090 13.32 4.97 0.53 0.63 
5 533 25 3,077 14.40 4.73 0.58 0.63 
541 25 3,094 14.46 5.01 0.58 0.61 
552 25 3,061 12.90 5.67 0.52 0.57 
611 25 3,165 13.98 5.85 0.56 0.66 
English 621 25 3,169 13.57 5.33 0.54 0.71 
Language 6 631 25 3,159 14.69 4.41 0.59 0.66 
Arts 641 25 3,150 12.86 5.24 0.51 0.69 
651 25 3,180 14.33 6.38 0.57 0.67 
713 25 3,298 11.65 5.78 0.47 0.66 
724 25 3,308 12.65 6.80 0.51 0.64 
7 732 25 3,293 11.56 5.22 0.46 0.68 
741 25 3,309 13.83 5.33 0.55 0.62 
753 25 3,325 14.93 7.93 0.60 0.65 
822 25 3,340 11.77 5.92 0.47 0.71 
823 25 3,322 12.16 6.27 0.49 0.71 
8 833 25 3,340 12.48 5.92 0.50 0.76 
842 25 3,294 13.06 5.77 0.52 0.67 
852 25 3,362 12.19 6.25 0.49 0.71 
911 25 2,840 15.19 6.03 0.61 0.71 
921 25 2,847 13.30 6.09 0.53 0.79 
High School 931 25 2,850 14.60 5.65 0.58 0.76 
942 25 2,842 11.72 6.21 0.47 0.73 
951 25 2,857 13.98 6.47 0.56 0.77 
301 25 2,723 11.93 6.60 0.48 0.70 
Mathematics 3 302 25 2,723 14.92 6.67 0.60 0.74 
303 25 2,719 12.10 5.19 0.48 0.76 


3 304 25 2,722 13.78 5.18 0.55 0.71
305 25 2,706 15.71 5.98 0.63 0.69 
401 25 2,940 13.18 6.69 0.53 0.74 
402 25 2,932 13.95 8.13 0.56 0.67 
4 403 25 2,943 10.84 4.98 0.43 0.70 
404 25 2,915 12.46 5.90 0.50 0.74 
405 25 2,920 15.37 6.21 0.61 0.65 
501 25 3,061 14.31 7.02 0.57 0.67 
502 25 3,103 11.55 6.07 0.46 0.71 
5 503 25 3,044 10.08 5.06 0.40 0.70 
504 25 3,102 12.18 5.53 0.49 0.71 
505 25 3,081 15.03 5.09 0.60 0.60 
605 25 3,162 12.74 5.75 0.51 0.74 
606 25 3,161 14.19 5.97 0.57 0.77 
6 607 25 3,139 12.53 5.54 0.50 0.78 
608 25 3,172 13.34 5.29 0.53 0.80 
Mathematics 618 25 3,171 12.93 7.21 0.52 0.70 
705 25 3,321 15.61 5.32 0.62 0.61 
706 25 3,294 13.63 6.87 0.55 0.66 
7 707 25 3,308 10.17 5.40 0.41 0.67 
708 25 3,318 14.34 7.79 0.57 0.67 
710 25 3,317 14.44 6.59 0.58 0.62 
805 25 3,345 12.27 5.71 0.49 0.74 
808 25 3,350 11.81 6.07 0.47 0.72 
8 809 25 3,318 11.80 5.92 0.47 0.76 
810 25 3,315 13.45 7.08 0.54 0.68 
818 25 3,360 9.63 5.18 0.39 0.70 
911 25 2,847 14.36 7.04 0.57 0.70 
912 25 2,808 14.51 5.66 0.58 0.63 
High School 913 25 2,823 14.39 7.44 0.58 0.65 
914 25 2,838 14.94 6.88 0.60 0.70 
915 25 2,832 14.09 7.54 0.56 0.71 
4 411 25 2,902 13.22 5.96 0.53 0.56
422 25 2,922 13.90 5.80 0.56 0.54
Science 8 813 25 3,327 14.27 6.38 0.57 0.51
832 25 3,337 15.28 5.25 0.61 0.53
High School 921 25 2,855 16.00 7.19 0.64 0.62
931 25 2,857 15.16 5.31 0.61 0.62
911 25 2,856 15.30 5.97 0.61 0.63
Social Studies High School 921 25 2,840 15.48 5.76 0.62 0.66
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APPENDIX I—CORRELATIONS BETWEEN 
STANDARDS-BASED ITEMS 
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Table I-1. 2014-15 NYSAA: Standards-Based Item Correlations for Grade 3 
Subject                Pair of Standards  N      Correlation
English Language Arts  314  322           2,669  0.52
                       314  331           2,690  0.53
                       314  341           2,681  0.57
                       314  351           2,688  0.52
                       322  331           2,685  0.62
                       322  341           2,672  0.55
                       322  351           2,682  0.66
                       331  341           2,699  0.58
                       331  351           2,708  0.68
                       341  351           2,699  0.56
Mathematics            301  302           2,702  0.61
                       301  303           2,698  0.68
                       301  304           2,701  0.56
                       301  305           2,686  0.55
                       302  303           2,696  0.66
                       302  304           2,701  0.64
                       302  305           2,687  0.61
                       303  304           2,696  0.63
                       303  305           2,680  0.60
                       304  305           2,684  0.60
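
The N column differs from pair to pair, which is what one would expect if each correlation is computed only over students who have a score on both standards-based items. The sketch below illustrates that pairwise-complete approach. It is an assumption about how the N values arise, not a description of the report's software, and the function name is hypothetical.

```python
from math import sqrt
from typing import Optional, Sequence, Tuple


def pairwise_correlation(
    x: Sequence[Optional[float]], y: Sequence[Optional[float]]
) -> Tuple[int, float]:
    """Return (N, Pearson r) using only cases where both scores are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return n, sxy / sqrt(sxx * syy)


# Hypothetical scores; None marks a student without a score on that item.
item_a = [10, 15, None, 20, 25]
item_b = [12, 14, 18, None, 30]
n, r = pairwise_correlation(item_a, item_b)
print(n, round(r, 2))
```

Here only three of the five hypothetical students contribute to the correlation, so the reported N would be 3.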


Table I-2. 2014-15 NYSAA: Standards-Based Item Correlations for Grade 4 


Subject                Pair of Standards  N      Correlation
English Language Arts  411  413           2,902  0.55
                       411  432           2,911  0.59
                       411  442           2,906  0.50
                       411  453           2,911  0.53
                       413  432           2,884  0.44
                       413  442           2,881  0.47
                       413  453           2,883  0.50
                       432  442           2,887  0.44
                       432  453           2,895  0.50
                       442  453           2,887  0.56
Mathematics            401  402           2,906  0.60
                       401  403           2,915  0.65
                       401  404           2,889  0.67
                       401  405           2,895  0.57
                       402  403           2,910  0.55
                       402  404           2,882  0.60
                       402  405           2,888  0.51
                       403  404           2,892  0.64
                       403  405           2,897  0.54
                       404  405           2,874  0.59
Science                411  422           2,871  0.59
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Table I-3. 2014-15 NYSAA: Standards-Based Item Correlations for Grade 5 
Subject                Pair of Standards  N      Correlation
English Language Arts  512  523           3,045  0.51
                       512  533           3,033  0.47
                       512  541           3,051  0.49
                       512  552           3,020  0.47
                       523  533           3,041  0.58
                       523  541           3,060  0.49
                       523  552           3,027  0.44
                       533  541           3,048  0.50
                       533  552           3,013  0.44
                       541  552           3,031  0.49
Mathematics            501  502           3,037  0.58
                       501  503           2,982  0.55
                       501  504           3,036  0.58
                       501  505           3,017  0.53
                       502  503           3,023  0.67
                       502  504           3,080  0.63
                       502  505           3,057  0.50
                       503  504           3,023  0.61
                       503  505           3,003  0.48
                       504  505           3,060  0.53


Table I-4. 2014-15 NYSAA: Standards-Based Item Correlations for Grade 6 
Subject                Pair of Standards  N      Correlation
English Language Arts  611  621           3,127  0.57
                       611  631           3,117  0.57
                       611  641           3,107  0.56
                       611  651           3,136  0.55
                       621  631           3,121  0.55
                       621  641           3,115  0.61
                       621  651           3,138  0.60
                       631  641           3,106  0.55
                       631  651           3,130  0.56
                       641  651           3,124  0.58
Mathematics            605  606           3,115  0.66
                       605  607           3,092  0.69
                       605  608           3,126  0.67
                       605  618           3,126  0.63
                       606  607           3,088  0.70
                       606  608           3,123  0.73
                       606  618           3,124  0.62
                       607  608           3,108  0.72
                       607  618           3,099  0.64
                       608  618           3,136  0.64
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Table I-5. 2014-15 NYSAA: Standards-Based Item Correlations for Grade 7 
Subject                Pair of Standards  N      Correlation
English Language Arts  713  724           3,249  0.57
                       713  732           3,238  0.58
                       713  741           3,255  0.50
                       713  753           3,267  0.51
                       724  732           3,246  0.55
                       724  741           3,261  0.46
                       724  753           3,275  0.53
                       732  741           3,249  0.55
                       732  753           3,259  0.54
                       741  753           3,279  0.53
Mathematics            705  706           3,262  0.48
                       705  707           3,269  0.53
                       705  708           3,278  0.49
                       705  710           3,281  0.50
                       706  707           3,241  0.58
                       706  708           3,254  0.57
                       706  710           3,258  0.49
                       707  708           3,268  0.56
                       707  710           3,263  0.51
                       708  710           3,275  0.54


Table I-6. 2014-15 NYSAA: Standards-Based Item Correlations for Grade 8 
Subject                Pair of Standards  N      Correlation
English Language Arts  822  823           3,272  0.61
                       822  833           3,288  0.65
                       822  842           3,247  0.56
                       822  852           3,311  0.61
                       823  833           3,273  0.66
                       823  842           3,229  0.57
                       823  852           3,292  0.62
                       833  842           3,243  0.60
                       833  852           3,306  0.65
                       842  852           3,263  0.58
Mathematics            805  808           3,307  0.64
                       805  809           3,279  0.65
                       805  810           3,274  0.61
                       805  818           3,315  0.62
                       808  809           3,283  0.64
                       808  810           3,281  0.57
                       808  818           3,324  0.61
                       809  810           3,252  0.63
                       809  818           3,290  0.65
                       810  818           3,288  0.55
Science                813  832           3,282  0.55
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Table I-7. 2014-15 NYSAA: Standards-Based Item Correlations for High School 


Subject                Pair of Standards  N      Correlation
English Language Arts  911  921           2,795  0.63
                       911  931           2,796  0.65
                       911  942           2,790  0.59
                       911  951           2,802  0.63
                       921  931           2,801  0.68
                       921  942           2,797  0.71
                       921  951           2,807  0.72
                       931  942           2,797  0.63
                       931  951           2,815  0.69
                       942  951           2,802  0.64
Mathematics            911  912           2,761  0.54
                       911  913           2,774  0.54
                       911  914           2,791  0.63
                       911  915           2,783  0.65
                       912  913           2,734  0.52
                       912  914           2,756  0.59
                       912  915           2,750  0.52
                       913  914           2,766  0.53
                       913  915           2,764  0.58
                       914  915           2,773  0.62
Science                921  931           2,816  0.65
Social Studies         911  921           2,810  0.67
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